Harnessing the XGBoost Ensemble for Intelligent Prediction and Identification of Factors with a High Impact on Air Quality: A Case Study of Urban Areas in Jakarta Province, Indonesia

Wahyu Wibowo*, Harun Al Azies, Susi A. Wilujeng, Shuzlina Abdul-Rahman

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

This study aims to develop an accurate air quality prediction model to help address Jakarta's air pollution challenges. It uses conventional air pollution index data from air quality monitoring stations. In the research phase, the data are explored, SMOTE is applied to handle class imbalance, and XGBoost is used to build a model with the best parameters. In the evaluation stage, the model achieves an accuracy of 99.516%, an F1-score of 99.528%, and a recall of 99.509%, which indicates a strong ability to classify and predict air quality levels. The study also investigates the importance of individual variables in predicting air quality, using the weight, gain, total gain, and cover importance measures. Although SO2 contributes to the prediction, PM2.5 dominates on several of these measures, which points to a significant influence. By combining advanced analytical approaches with an accurate model, the study contributes to a better understanding of the complicated dynamics of air quality prediction. This knowledge is useful for developing targeted solutions to air pollution and for promoting healthier urban environments.
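The abstract names SMOTE as the imbalance-handling step but does not describe how it was configured. As a rough illustration only, the sketch below shows the core idea of SMOTE in plain Python: each synthetic sample is placed at a random position on the segment between a minority-class sample and one of its nearest minority-class neighbours. The function name `smote_oversample`, the two-feature (PM2.5, SO2) readings, and the values of `k` and `n_new` are all hypothetical and are not taken from the study, which presumably relied on a library implementation.

```python
import random
from math import dist


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base.
        neighbours = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical (PM2.5, SO2) readings for a rare air quality class.
minority = [(150.0, 40.0), (160.0, 42.0), (155.0, 45.0), (170.0, 39.0)]
new_points = smote_oversample(minority, n_new=6, k=2)
print(len(new_points))
```

Because every synthetic point lies between two real minority samples, oversampling of this kind adds no values outside the observed range of the minority class. It is normally applied to the training split only, so that evaluation metrics are computed on real observations.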

Original language: English
Title of host publication: Lecture Notes on Data Engineering and Communications Technologies
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 319-334
Number of pages: 16
Publication status: Published - 2024

Publication series

Name: Lecture Notes on Data Engineering and Communications Technologies
Volume: 191
ISSN (Print): 2367-4512
ISSN (Electronic): 2367-4520

Keywords

  • Air quality prediction
  • Jakarta air pollution
  • Predictive modeling
  • XGBoost algorithm
