Spatiotemporally Continuous Reconstruction of Retrieved PM<sub>2.5</sub> Data Using an Autogeoi-Stacking Model in the Beijing-Tianjin-Hebei Region, China

by: Wenhao Chu, Chunxiao Zhang, Yuwei Zhao, Rongrong Li, Pengda Wu

Format: Article
Published: MDPI AG 2022-09-01

Description

Aerosol optical depth (AOD) observations have been widely used to generate wide-coverage PM<sub>2.5</sub> retrievals, motivated by the adverse effects of long-term exposure to PM<sub>2.5</sub> and by the sparsity and uneven distribution of monitoring sites. However, because AOD products contain non-random missing values and nighttime gaps, obtaining spatiotemporally continuous hourly data with high accuracy has been a great challenge. Therefore, this study developed an automatic geo-intelligent stacking (autogeoi-stacking) model, which contained seven machine learning sub-models stacked through a Catboost model. The autogeoi-stacking model used the automated feature engineering (autofeat) method to identify spatiotemporal characteristics of multi-source datasets and to generate extra features through automatic non-linear transformations of multiple original features. Ten-fold cross-validation (CV) was used to evaluate the 24-hour, continuous ground-level PM<sub>2.5</sub> estimations in the Beijing-Tianjin-Hebei (BTH) region during 2018. The results showed that the autogeoi-stacking model performed well in the study area, with a coefficient of determination (R<sup>2</sup>) of 0.88, a root mean squared error (RMSE) of 17.38 µg/m<sup>3</sup>, and a mean absolute error (MAE) of 10.71 µg/m<sup>3</sup>. The estimated PM<sub>2.5</sub> concentrations were accurate during both the day (08:00–18:00, local time) and the night (19:00–07:00), with cross-validation coefficients of determination (CV-R<sup>2</sup>) of 0.90 and 0.88, respectively, and captured hourly PM<sub>2.5</sub> variations well, even during a severe ambient air pollution event. On the seasonal scale, R<sup>2</sup> was highest in winter, followed by autumn, spring, and summer. Compared with the original stacking model, the improvement in R<sup>2</sup> from the autofeat and hyperparameter optimization approaches was up to 5.33%.
In addition, the annual mean values indicated that the southern areas, such as Shijiazhuang, Xingtai, and Handan, suffered higher PM<sub>2.5</sub> concentrations, whereas the northern regions (e.g., Zhangjiakou and Chengde) experienced low PM<sub>2.5</sub> concentrations. In summary, the proposed method performed well and could provide ideas for constructing geoi-features and spatiotemporally continuous inversion products of PM<sub>2.5</sub>.
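
The evaluation protocol named in the abstract (10-fold CV scored with R<sup>2</sup>, RMSE, and MAE) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the names `aod` and `pm25` are hypothetical, and a one-variable least-squares line stands in for the autogeoi-stacking model, which is not reproduced here. The metrics are pooled over all held-out predictions, which may differ from how the authors aggregated their folds.

```python
import math
import random


def r2_rmse_mae(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def cross_validate(xs, ys, k=10):
    """Predict every sample from a model trained on the other k-1 folds."""
    preds = [0.0] * len(xs)
    for fold in kfold_indices(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in fold:
            preds[i] = slope * xs[i] + intercept
    return r2_rmse_mae(ys, preds)


# Synthetic data (assumption): PM2.5 rises linearly with AOD plus Gaussian noise.
rng = random.Random(42)
aod = [rng.uniform(0.1, 2.0) for _ in range(200)]
pm25 = [40.0 * a + 10.0 + rng.gauss(0.0, 5.0) for a in aod]

cv_r2, cv_rmse, cv_mae = cross_validate(aod, pm25, k=10)
print(f"CV-R2={cv_r2:.2f}  RMSE={cv_rmse:.2f}  MAE={cv_mae:.2f}")
```

Because each sample is predicted only by a model that never saw it during fitting, the pooled scores estimate out-of-sample accuracy rather than goodness of fit on the training data.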