Forecasting insect abundance using time series embedding and machine learning

Implementing insect monitoring systems provides an excellent opportunity to create accurate interventions for insect control. However, selecting the appropriate time for an intervention is still an open question due to the inherent difficulty of implementing on-site monitoring in real time. This decision is even more critical for insect species whose populations can increase abruptly. A possible solution to enhance decision-making is to apply forecasting methods to predict insect abundance. However, another layer of complexity is added when covariates are included in the forecast, such as climate time series collected alongside the monitoring system, because many combinations of climate time series and their lags can be used to build a forecasting method. This paper proposes a new approach to address this problem by combining statistics, machine learning, and time series embedding. We used two datasets containing time series of aphid abundance and climate data collected weekly over eight years in the municipalities of Coxilha and Passo Fundo in Southern Brazil. We also conducted a simulation study, based on a probabilistic autoregressive model with exogenous time series and Poisson and negative binomial distributions, to check how incorporating climate time series influences the performance of our approach. We pre-processed the data using our newly proposed approach and using more straightforward approaches commonly used to train machine learning algorithms on time series problems. We evaluated the performance of the selected machine learning algorithms using the Root Mean Squared Error (RMSE) of one-step-ahead forecasts. With Random Forests, Lasso-regularised linear regression, and LightGBM regression, our novel approach yields competitive forecasts while automatically selecting the insect abundance series, climate time series, and lags used to aid forecasting.
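To make the ingredients named above concrete, the following is a minimal Python sketch of a generic lag (time-delay) embedding of a count series with one climate covariate, evaluated by one-step-ahead RMSE. It is not the paper's method: the abstract does not specify the embedding or the selection procedure, so the function names (`lag_embed`, `one_step_ahead_rmse`), the synthetic Poisson autoregressive data, the coefficients, and the use of ordinary least squares as a stand-in for Random Forests, Lasso, or LightGBM are all assumptions made for illustration.

```python
import numpy as np


def lag_embed(target, covariates, max_lag):
    """Build a lagged design matrix (hypothetical helper).

    The row for time t holds target[t-1], ..., target[t-max_lag], followed by
    the same lags of each covariate; the response for that row is target[t].
    """
    target = np.asarray(target, dtype=float)
    series = [target] + [np.asarray(c, dtype=float) for c in covariates]
    rows = []
    for t in range(max_lag, len(target)):
        # Reverse each window so that lag 1 comes first.
        rows.append(np.concatenate([s[t - max_lag:t][::-1] for s in series]))
    return np.array(rows), target[max_lag:]


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def one_step_ahead_rmse(X, y, n_test):
    """Expanding-window evaluation over the last n_test points.

    Before each test point the model is refit on all earlier rows.
    Ordinary least squares is an assumed stand-in for the learners
    named in the abstract.
    """
    preds = []
    for i in range(len(y) - n_test, len(y)):
        A = np.column_stack([np.ones(i), X[:i]])
        coef, *_ = np.linalg.lstsq(A, y[:i], rcond=None)
        preds.append(coef[0] + X[i] @ coef[1:])
    return rmse(y[-n_test:], preds)


# Synthetic weekly data (assumed, not the Coxilha or Passo Fundo series):
# a Poisson autoregressive count series driven by a seasonal temperature.
rng = np.random.default_rng(0)
n = 200
temperature = 20 + 5 * np.sin(2 * np.pi * np.arange(n) / 52) + rng.normal(0, 1, n)
counts = np.zeros(n)
for t in range(1, n):
    log_rate = 0.5 + 0.3 * np.log1p(counts[t - 1]) + 0.05 * temperature[t - 1]
    counts[t] = rng.poisson(np.exp(log_rate))

X, y = lag_embed(counts, [temperature], max_lag=3)
print("one-step-ahead RMSE:", one_step_ahead_rmse(X, y, n_test=20))
```

Swapping the least-squares fit for a regularised or tree-based learner only changes the body of the loop in `one_step_ahead_rmse`; the embedding and the evaluation scheme stay the same.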
