Publication

At Risk Population Estimates for Belarus, Poland and Slovakia with Machine Learning

by Viswadeep Lebakula, Clinton W Stipek, Daniel S Adams, Justin F Epting, Marie L Urban
Publication Type
Conference Paper
Book Title
2024 IEEE International Conference on Big Data (BigData)
Publication Date
Page Numbers
5804 to 5811
Publisher Location
New Jersey, United States of America
Conference Name
2024 IEEE International Conference on Big Data (BigData)
Conference Location
Washington, District of Columbia, United States of America
Conference Sponsor
IEEE
Conference Date
-

High-resolution gridded population modeling is crucial for many applications, including disaster response planning, infectious disease spread modeling, climate change impact estimation, and policy development. Multiple gridded population datasets have been developed, each tailored to specific objectives. Among them, the LandScan Global dataset is designed to represent ambient and unwarned population distributions. However, this dataset relies on a statistical approach that requires manual adjustments, making it time-consuming and labor-intensive. Existing machine learning (ML) methods often train and test at different spatial resolutions, which can inflate reported performance, and they rely on census population totals for disaggregation. To address these limitations, in this study we developed population estimates using two ML models, Random Forest (RF) and XGBoost, trained and tested at a consistent 30 arc-second resolution (≈1 square kilometer). The models were trained on 2020 data to predict 2021 populations for three countries: Belarus, Poland, and Slovakia. Our findings show that the performance of both RF (MAE ranging from 5.75 to 13.25) and XGBoost (MAE ranging from 8.15 to 23.44) is close to LandScan Global estimates. Furthermore, neither model performed best across all grid cells: the RF model was more effective in areas with lower populations, while XGBoost excelled in more densely populated regions. The proposed approach can be used for countries where census data is not available.
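The comparison described in the abstract, an overall MAE per model plus a breakdown by population level, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the bin edge of 100 people per cell, and all cell values are assumptions invented for the example, and `model_a` and `model_b` are hypothetical stand-ins rather than the RF or XGBoost outputs from the paper.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mae_by_bin(y_true, y_pred, edges):
    """MAE per population bin.

    Bin i holds the cells whose reference value t satisfies
    edges[i] <= t < edges[i + 1]. Empty bins are omitted.
    """
    result = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        pairs = [(t, p) for t, p in zip(y_true, y_pred) if lo <= t < hi]
        if pairs:
            result[(lo, hi)] = mae([t for t, _ in pairs], [p for _, p in pairs])
    return result


# Toy values for five 30 arc-second grid cells (made up for illustration).
reference = [0, 10, 50, 2000, 5000]   # reference population per cell
model_a = [1, 12, 45, 1700, 5600]     # hypothetical model, closer in sparse cells
model_b = [5, 20, 60, 1950, 5100]     # hypothetical model, closer in dense cells

edges = [0, 100, float("inf")]        # assumed split: sparse vs. dense cells

overall = {"model_a": mae(reference, model_a), "model_b": mae(reference, model_b)}
binned = {
    "model_a": mae_by_bin(reference, model_a, edges),
    "model_b": mae_by_bin(reference, model_b, edges),
}

for name in ("model_a", "model_b"):
    print(name, "overall MAE:", round(overall[name], 2))
    for (lo, hi), value in binned[name].items():
        print(f"  cells with population in [{lo}, {hi}): MAE = {round(value, 2)}")
```

In this toy data, a single overall MAE hides the fact that each hypothetical model is better in a different population range, which is why a per-bin breakdown is useful when no one model wins across all grid cells.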