Article type: Research article

Authors

1 Master's student in Environmental Health Engineering, Member of the Student Research Committee, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran.

2 Associate Professor, Department of Environmental Health, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran.

3 Student in the complementary research program, School of Health, Member of the Student Research Committee, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran.

Abstract

Background and Purpose: Air pollution in large and industrial cities results from a variety of pollutants entering the urban atmosphere, in particular particulate matter with a diameter of less than 2.5 microns (PM2.5). Predicting the locations where PM2.5 levels are high can substantially support sound management and planning aimed at improving air quality. The aim of this study is to predict PM2.5 concentrations using four non-linear artificial intelligence models based on machine learning.

Materials and Methods: The machine learning techniques used in this study were Light Gradient Boosting Machine, Extreme Gradient Boosting Regressor, Random Forest, and Gradient Boosting Regressor. Meteorological data and particulate matter concentrations were collected to predict the air quality index in the city of Mashhad over the period 1395 to 1401 in the Solar Hijri calendar (2016 to 2022).

Results: All four machine learning models performed very well in predicting PM2.5 concentrations, with about 95 percent of their predictions falling within a factor of the observed concentration. Among the four machine learning algorithms used, the Gradient Boosting Regressor outperformed the other non-linear models, with high accuracy metrics including a coefficient of determination (R²) of 0.9802, a mean absolute error of 0.54, a mean squared error of 5.33, a root mean squared error of 2.31, and a mean absolute percentage error of 1.9%.

Conclusion: This study proposes a method for obtaining high-accuracy PM2.5 predictions using artificial intelligence based on machine learning, which is useful for monitoring air quality at a global scale and for improving acute exposure assessment in epidemiological research.

Article Title [English]

Statistical Analysis and Forecast Modeling of PM2.5 Concentration Using Artificial Intelligence Based on Machine Learning in Mashhad (2016-2022)

Authors [English]

  • Ahmad Makhdoomi 1
  • Maryam Sarkhosh 2
  • Somayeh Ziaei 3

1 Master's Student in Environmental Health Engineering, Member of the Student Research Committee, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran

2 Associate Professor, Department of Environmental Health, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran

3 Postdoctoral Researcher, School of Health, Member of the Student Research Committee, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran.

Abstract [English]

Background and Purpose: This study aims to forecast PM2.5 concentrations using four non-linear Machine Learning (ML) models.

Materials and Methods: The ML techniques employed include Light Gradient Boosting Machine (LGBM), Extreme Gradient Boosting Regressor (XGBR), Random Forest (RF), and Gradient Boosting Regressor (GBR). Meteorological and pollutant data were collected to predict the Air Quality Index (AQI) in Mashhad, Khorasan Razavi Province, Iran, for the period from 2016 to 2022.
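To make the boosting idea behind GBR concrete, the following is a minimal sketch of gradient boosting for squared-error regression using one-split regression stumps on a single feature. It is not the authors' implementation: the abstract does not state which libraries or hyperparameters were used, the function names (`fit_stump`, `fit_gradient_boosting`, `predict`) are hypothetical, and the numbers in the usage lines are invented toy values, not data from the Mashhad study.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    order = sorted(set(x))
    best = None
    for lo, hi in zip(order, order[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # Only one distinct feature value: no split is possible.
        mean = sum(residuals) / len(residuals)
        return order[0], mean, mean
    return best[1], best[2], best[3]


def fit_gradient_boosting(x, y, n_rounds=50, learning_rate=0.1):
    """Boost stumps on the residuals (negative gradient of squared loss)."""
    base = sum(y) / len(y)  # initial prediction: the mean target
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_value, right_value = fit_stump(x, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            pi + learning_rate * (left_value if xi <= threshold else right_value)
            for xi, pi in zip(x, predictions)
        ]
    return base, learning_rate, stumps


def predict(model, xi):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if xi <= threshold else right for threshold, left, right in stumps
    )


# Toy illustration only (assumed values): a single hypothetical
# meteorological feature against a PM2.5-like target.
toy_feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_target = [40.0, 35.0, 30.0, 22.0, 18.0, 15.0]
toy_model = fit_gradient_boosting(toy_feature, toy_target)
```

Library implementations such as those named in the abstract use deeper trees, many features, and additional regularization; the sketch keeps only the core loop of fitting each new learner to the current residuals.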

Results: The ML models performed exceptionally well in predicting PM2.5 concentrations, with approximately 95% of their predictions falling within a factor of the observed values. Additionally, the predicted PM2.5 concentrations were compared with observed values to assess prediction accuracy. Among the four ML models, GBR demonstrated the best performance, achieving high accuracy metrics, including a coefficient of determination (R²) of 0.9802, a mean absolute error (MAE) of 0.54, a mean squared error (MSE) of 5.33, a root mean squared error (RMSE) of 2.31, and a mean absolute percentage error (MAPE) of 1.9%.
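The accuracy metrics named above can be computed from paired observed and predicted values as in the sketch below. It uses the standard textbook definitions, which the abstract does not spell out, so they are an assumption about what the authors computed. The helper names are hypothetical, and the default `factor=2.0` is also an assumption, since the abstract does not say which factor was used.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, MAE, MSE, RMSE and MAPE (in percent) for paired values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and of equal length")
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined for zero observations; values are assumed to be > 0.
    mape = 100.0 * sum(abs(e) / abs(o) for o, e in zip(observed, errors)) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "mse": mse, "rmse": rmse, "mape": mape}


def fraction_within_factor(observed, predicted, factor=2.0):
    """Share of predictions with 1/factor <= predicted/observed <= factor.

    The default factor of 2 is an assumption, not a value from the abstract.
    """
    hits = sum(
        1 for o, p in zip(observed, predicted)
        if 1.0 / factor <= p / o <= factor
    )
    return hits / len(observed)
```

Because RMSE is the square root of MSE, the two reported values can be checked against each other: the square root of 5.33 is about 2.31, matching the reported RMSE.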

Conclusion: This study proposes a high-accuracy PM2.5 prediction method using ML, which can be beneficial for global air quality monitoring and improving acute exposure assessments in epidemiological research.
 
Open Access Policy: This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/

Keywords [English]

  • Air quality index
  • PM2.5
  • Machine learning
  • Non-linear Models