An improved low-power measurement of ambient NO<sub>2</sub> and O<sub>3</sub> combining electrochemical sensor clusters and machine learning

by: K. R. Smith, P. M. Edwards, P. D. Ivatt, J. D. Lee, F. Squires, C. Dai, R. E. Peltier, M. J. Evans, Y. Sun, A. C. Lewis

Format: Article
Published: Copernicus Publications 2019-02-01

Description

<p>Low-cost sensors (LCSs) are an appealing solution to the problem of spatial resolution in air quality measurement, but they do not currently match the analytical performance of regulatory reference methods. Individual sensors can be susceptible to analytical cross-interferences; have random signal variability; and experience drift over short, medium and long timescales. To overcome some of the performance limitations of individual sensors, we use a clustering approach that takes the instantaneous median signal from six identical electrochemical sensors to minimize the randomized drifts and inter-sensor differences. We report here on a low-power analytical device (<span class="inline-formula"><i>&lt;</i> 200</span>&thinsp;W) that comprises clusters of sensors for <span class="inline-formula">NO<sub>2</sub></span>, <span class="inline-formula">O<sub><i>x</i></sub></span>, CO and total volatile organic compounds (VOCs) and that measures supporting parameters such as water vapour and temperature. The device was tested in the field against reference monitors, collecting ambient air pollution data in Beijing, China. Clustered sensor data for <span class="inline-formula">NO<sub>2</sub></span> and <span class="inline-formula">O<sub><i>x</i></sub></span> were compared against reference methods using calibrations derived from factory settings, from in-field simple linear regression (SLR) and from three machine learning (ML) algorithms. The parametric supervised ML algorithms, boosted regression trees (BRTs) and boosted linear regression (BLR), and the non-parametric technique, Gaussian process (GP), used all available sensor data to improve the measurement estimate of <span class="inline-formula">NO<sub>2</sub></span> and <span class="inline-formula">O<sub><i>x</i></sub></span>. In all cases ML produced an observational value that was closer to reference measurements than SLR alone. In combination, sensor clustering and ML generated sensor data of a quality that was close to that of regulatory measurements (using the RMSE metric) yet retained a very substantial cost and power advantage.</p>
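The clustering and SLR steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`cluster_median`, `fit_slr`, `rmse`) and all sample values are hypothetical, and only the median clustering, an ordinary least-squares calibration against a reference and the RMSE metric are shown. The ML calibrations (BRT, BLR, GP) are not reproduced here.

```python
import math
import statistics


def cluster_median(readings):
    """Return the instantaneous median across sensors for each time step.

    readings: list of time steps, each a list of per-sensor signals.
    """
    return [statistics.median(step) for step in readings]


def fit_slr(x, y):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(statistics.fmean(squared))


# Hypothetical raw signals: four time steps, six sensors each.
# The sixth sensor carries a large offset that the median suppresses.
readings = [
    [10.0, 11.0, 9.0, 10.5, 9.5, 30.0],
    [20.0, 21.0, 19.0, 20.5, 19.5, 40.0],
    [30.0, 31.0, 29.0, 30.5, 29.5, 50.0],
    [40.0, 41.0, 39.0, 40.5, 39.5, 60.0],
]
# Hypothetical co-located reference concentrations (invented values).
reference = [21.0, 41.5, 61.0, 81.5]

clustered = cluster_median(readings)
slope, intercept = fit_slr(clustered, reference)
calibrated = [slope * value + intercept for value in clustered]
error = rmse(calibrated, reference)
```

Taking the median rather than the mean means a single drifting or offset sensor in the cluster has little effect on the clustered signal, which is the motivation given in the abstract for using six identical sensors.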