Development of a machine learning approach for local-scale ozone forecasting: application to Kennewick, WA

Fan, K.; Dhammapala, R.; Harrington, K.; Lamastro, R.; Lamb, B.; Lee, Y.

Item Type:	Article
Title:	Development of a machine learning approach for local-scale ozone forecasting: application to Kennewick, WA
Creators Name:	Fan, K., Dhammapala, R., Harrington, K., Lamastro, R., Lamb, B. and Lee, Y.
Abstract:	Chemical transport models (CTMs) are widely used for air quality forecasts, but these models require large computational resources and often suffer from a systematic bias that leads to missed poor air quality events. For example, AIRPACT, a CTM-based operational air quality forecasting system for the Pacific Northwest, uses over 100 processors for several hours to provide 48-h forecasts daily, but struggles to capture unhealthy O₃ episodes during the summer and early fall, especially over Kennewick, WA. This research developed machine learning (ML) based O₃ forecasts for Kennewick, WA to demonstrate an improved forecast capability. We used 2017-2020 simulated meteorology and O₃ observation data from Kennewick as training datasets. The meteorology datasets are from the Weather Research and Forecasting (WRF) meteorological model forecasts produced daily by the University of Washington. Our ozone forecasting system consists of two ML models, ML1 and ML2, to improve predictability: ML1 uses a random forest (RF) classifier and multiple linear regression (MLR) models, and ML2 uses a two-phase RF regression model with best-fit weighting factors. To avoid overfitting, we evaluated the ML forecasting system with 10-time, 10-fold, and walk-forward cross-validation analyses. Compared to AIRPACT, ML1 improved forecast skill for high-O₃ events and captured 5 out of 10 unhealthy O₃ events, while AIRPACT and ML2 missed all the unhealthy events. ML2 showed better forecast skill for less elevated O₃ events. Based on this result, we set up our ML modeling framework to use ML1 for high-O₃ events and ML2 for less elevated O₃ events. Since May 2019, the ML modeling framework has been used to produce daily 72-h O₃ forecasts and has provided forecasts via the web for clean air agency and public use: http://ozonematters.com/. Unlike the testing period, the operational forecasting period has not had unhealthy O₃ events. Nevertheless, the ML modeling framework demonstrated a reliable forecasting capability at a selected location with far fewer computational resources: the ML system uses a single processor for minutes, whereas the CTM-based forecasting system uses more than 100 processors for hours.
Keywords:	Machine Learning, Air Quality Forecasts, Ozone, Random Forest, Multiple Linear Regression
Source:	Frontiers in Big Data
ISSN:	2624-909X
Publisher:	Frontiers Media SA
Volume:	5
Page Range:	781309
Date:	10 February 2022
Official Publication:	https://doi.org/10.3389/fdata.2022.781309
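
The abstract names walk-forward cross-validation as one of the checks against overfitting. As a rough illustration only, the sketch below shows one common way such splits are built: each test year is paired with training data from earlier years only. The abstract does not say how the folds were actually partitioned, so splitting by calendar year, the function name `walk_forward_splits`, and the `min_train_years` parameter are all assumptions made for this example.

```python
from typing import Iterator, List, Sequence, Tuple


def walk_forward_splits(
    years: Sequence[int], min_train_years: int = 1
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training_years, test_year) pairs in chronological order.

    Hypothetical helper: each test year is paired only with earlier
    years, so no future information leaks into training.
    """
    ordered = sorted(set(years))
    for i in range(min_train_years, len(ordered)):
        yield ordered[:i], ordered[i]


if __name__ == "__main__":
    # 2017-2020 is the training-data period named in the abstract;
    # splitting it by year is an assumption of this sketch.
    for train_years, test_year in walk_forward_splits([2017, 2018, 2019, 2020]):
        print(f"train on {train_years}, test on {test_year}")
```

Unlike shuffled k-fold splitting, this ordering keeps every evaluation strictly forward in time, which mirrors how a daily operational forecast is used.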