Statistical and Machine Learning Methods Applied to the Prediction of Tropical Rainfall

J. Wang, R. K. W. Wong, M. Jun, C. Schumacher and R. Saravanan


We explore the use of three advanced statistical and machine learning methods (a generalized linear model, random forest, and neural network) to predict the occurrence and rain rate distribution of three tropical rain types (deep convective, stratiform, and shallow convective) observed by the radar onboard the GPM satellite over the West Pacific. Three-hourly temperature and moisture fields from MERRA-2 are used as predictors. While all three methods perform reasonably well at predicting the occurrence of each rain type, the neural network is the only method able to produce rain rate distributions similar to observations, especially for the top 5–10% of observed values. However, the neural network took the most effort to train and has a relatively high root mean square error, suggesting that it sometimes assigns high rain rates to situations that in reality produce much weaker rain rates.
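
The trade-off between reproducing the rain rate distribution and keeping the root mean square error low can be illustrated with a small synthetic sketch. It uses neither the GPM or MERRA-2 data nor any of the three models above. The lognormal rain rates, the single predictor `x`, and the names `simulate`, `mean_pred`, and `sampled_pred` are all assumptions made for illustration. One predictor returns the conditional mean, which minimizes squared error but smooths out extremes. The other draws from the conditional distribution, which recovers the upper tail but can place a high value where the observed rain rate was weak.

```python
import numpy as np


def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def simulate(n=200_000, noise_sd=0.5, seed=0):
    """Synthetic rain rates (illustrative assumption, not the paper's data).

    The observed rate is lognormal given one predictor x:
        log(obs) = x + noise_sd * eps,  with eps ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    obs = np.exp(x + noise_sd * rng.standard_normal(n))

    # Predictor A: the conditional mean E[obs | x] of the lognormal model.
    # It minimizes expected squared error but has a narrower distribution.
    mean_pred = np.exp(x + 0.5 * noise_sd ** 2)

    # Predictor B: an independent draw from the conditional distribution.
    # Its marginal distribution matches the observations, at the cost of
    # larger pointwise errors.
    sampled_pred = np.exp(x + noise_sd * rng.standard_normal(n))
    return obs, mean_pred, sampled_pred


obs, mean_pred, sampled_pred = simulate()
q_obs = np.percentile(obs, 99)
for name, pred in [("conditional mean", mean_pred),
                   ("conditional sample", sampled_pred)]:
    print(f"{name:>18}: RMSE = {rmse(pred, obs):.2f}, "
          f"99th percentile = {np.percentile(pred, 99):.2f} "
          f"(observed {q_obs:.2f})")
```

Under these assumptions, the sampled predictor should track the observed upper percentile more closely while showing the larger root mean square error, which is the same qualitative pattern described for the neural network.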