An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology

by: Mallory Lai, Shaun S. Wulff, Yongtao Cao, Timothy J. Robinson, Rasika Rajapaksha

Format: Article
Published: Elsevier 2023-12-01

Description

Wastewater-based epidemiology has emerged as a viable tool for monitoring disease prevalence in a population. This paper details a time series machine learning (TSML) method for predicting COVID-19 cases from wastewater and environmental variables. The TSML method utilizes a number of techniques to create an interpretable, hypothesis-driven framework for machine learning that can handle different nowcast and forecast lengths. Some of the techniques employed include:
• Feature engineering to construct interpretable features, like site-specific lead times, hypothesized to be potential predictors of COVID-19 cases.
• Feature selection to identify features with the best predictive performance for the tasks of nowcasting and forecasting.
• Prequential evaluation to prevent data leakage while evaluating the performance of the machine learning algorithm.
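
The three techniques listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single-feature linear model, the candidate lead times, and the synthetic numbers are all assumptions made for illustration. A wastewater signal is shifted by a candidate lead time (feature engineering), each candidate is scored with an expanding-window, test-then-train loop in which every prediction uses only earlier observations (prequential evaluation), and the lead time with the lowest error is kept (feature selection).

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (mean-only fit if x is constant)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def prequential_mae(signal, cases, lag, min_train=3):
    """Mean absolute error of test-then-train predictions for one lead time.

    The feature for time t is the signal observed `lag` steps earlier.
    The model that predicts cases[t] is fitted only on pairs whose
    target time is earlier than t, so no future data leaks into it.
    """
    pairs = [(signal[t - lag], cases[t]) for t in range(lag, len(cases))]
    errors = []
    for i in range(min_train, len(pairs)):
        train = pairs[:i]                      # strictly earlier observations
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        x_new, y_new = pairs[i]                # test first, train on it next round
        errors.append(abs(a + b * x_new - y_new))
    return mean(errors)


def select_lag(signal, cases, candidate_lags):
    """Return the candidate lead time with the lowest prequential error."""
    return min(candidate_lags, key=lambda lag: prequential_mae(signal, cases, lag))


# Synthetic data (made up for this sketch): cases follow the signal by 2 steps.
signal = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
cases = [0, 0] + [2 * s + 1 for s in signal[:-2]]

best_lag = select_lag(signal, cases, candidate_lags=[1, 2, 3])
print("selected lead time:", best_lag)
```

Because the synthetic case series is constructed to follow the signal by exactly two steps, the two-step lead time fits it without error. The sketch treats each step as a nowcast, where all earlier case counts are already known when the prediction is made; a multi-step forecast would also need to hold back the most recent targets from the training window.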