TY - CHAP
T1 - Harnessing the XGBoost Ensemble for Intelligent Prediction and Identification of Factors with a High Impact on Air Quality
T2 - A Case Study of Urban Areas in Jakarta Province, Indonesia
AU - Wibowo, Wahyu
AU - Al Azies, Harun
AU - Wilujeng, Susi A.
AU - Abdul-Rahman, Shuzlina
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
PY - 2024
Y1 - 2024
N2 - This article aims to develop an accurate air quality prediction model to address Jakarta's air pollution challenges. The study uses conventional air pollution index data from air quality monitoring stations. In the research phase, the data are explored, SMOTE is used to handle class imbalance, and XGBoost is used to develop a model with the best parameters. In the evaluation stage, the model achieves an accuracy of 99.516%, an F1-score of 99.528%, and a recall of 99.509%, indicating a strong ability to classify and predict air quality levels. The study also examines the importance of individual variables in predicting air quality, using the weight, gain, total gain, and cover importance measures. Although SO2 helps predict air quality, PM2.5 dominates on several of these measures, indicating a significant influence. By employing advanced analytical approaches and accurate models, this study contributes to a better understanding of the complex dynamics of air quality prediction. This knowledge is useful for developing targeted solutions to air pollution and promoting healthier urban environments.
AB - This article aims to develop an accurate air quality prediction model to address Jakarta's air pollution challenges. The study uses conventional air pollution index data from air quality monitoring stations. In the research phase, the data are explored, SMOTE is used to handle class imbalance, and XGBoost is used to develop a model with the best parameters. In the evaluation stage, the model achieves an accuracy of 99.516%, an F1-score of 99.528%, and a recall of 99.509%, indicating a strong ability to classify and predict air quality levels. The study also examines the importance of individual variables in predicting air quality, using the weight, gain, total gain, and cover importance measures. Although SO2 helps predict air quality, PM2.5 dominates on several of these measures, indicating a significant influence. By employing advanced analytical approaches and accurate models, this study contributes to a better understanding of the complex dynamics of air quality prediction. This knowledge is useful for developing targeted solutions to air pollution and promoting healthier urban environments.
KW - Air quality prediction
KW - Jakarta air pollution
KW - Predictive modeling
KW - XGBoost algorithm
UR - http://www.scopus.com/inward/record.url?scp=85192770804&partnerID=8YFLogxK
U2 - 10.1007/978-981-97-0293-0_24
DO - 10.1007/978-981-97-0293-0_24
M3 - Chapter
AN - SCOPUS:85192770804
T3 - Lecture Notes on Data Engineering and Communications Technologies
SP - 319
EP - 334
BT - Lecture Notes on Data Engineering and Communications Technologies
PB - Springer Science and Business Media Deutschland GmbH
ER -