Out-of-Sample Forecasting and Evaluation

Learn how to perform out-of-sample forecasts and how to evaluate them.

Motivation

The ultimate goal of many time series models is to predict future realizations of a process. A good time series model needs to forecast at an acceptable level of accuracy in production, using data it hasn’t seen before. In other words, the model must perform well out-of-sample.

Out-of-sample forecasting can come in many flavors, though. When developing our solution, we need to ask ourselves questions such as:

  • Do we want to forecast one or multiple steps ahead?

  • How many steps ahead is it useful/necessary to forecast? How many of these can we confidently predict?

  • How often do we want to update our model? Is it worth the cost?

  • What is an acceptable level of performance?

Our use case will usually determine the answers to most of these questions. Demand prediction, for instance, might require constant model updates and near real-time predictions. For more stable processes, such as our temperatures dataset, we might not need to update our model very often. However, we might still need to update our multiple-step-ahead forecasts as soon as new data arrives. Deciding on an acceptable level of performance requires choosing an evaluation metric, typically the mean squared error (MSE) or the root mean squared error (RMSE).
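As a quick reference, here is a minimal sketch of both metrics in plain NumPy. The function names and the toy observed/predicted values are purely illustrative; in practice, you would compute these over a held-out test set.

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: the average of the squared forecast errors
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

def rmse(y_true, y_pred):
    # Root mean squared error: expressed in the units of the series
    return np.sqrt(mse(y_true, y_pred))

# Toy example with hypothetical observed vs. predicted temperatures
observed = [21.0, 19.5, 18.0]
predicted = [20.0, 20.0, 18.5]
print(f"MSE:  {mse(observed, predicted):.3f}")
print(f"RMSE: {rmse(observed, predicted):.3f}")
```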

Appending new data

The ARIMAResults object in statsmodels has a handy method called append() for updating the model’s data. This method creates a new results object with an updated dataset, including the latest data that comes after the original model’s last observation. For instance, imagine that we have trained a model using temperatures from January 1, 2020, to December 31, 2020. If we were to apply the append() method, we’d need to give it data starting from January 1, 2021.
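A minimal sketch of this workflow is shown below. The synthetic daily temperature series and the (1, 0, 1) model order are purely illustrative placeholders:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical daily temperature series covering 2020-2021
idx = pd.date_range("2020-01-01", "2021-12-31", freq="D")
temps = pd.Series(np.random.default_rng(0).normal(15, 5, len(idx)), index=idx)

# Fit on 2020 data only
res = ARIMA(temps["2020"], order=(1, 0, 1)).fit()

# append() expects observations that begin right after the training
# sample's last date -- here, January 1, 2021
res_updated = res.append(temps["2021"], refit=False)

# The new results object covers both years and can forecast beyond
# the appended data
print(res_updated.forecast(steps=7))
```

Note that with refit=False (the default), append() reuses the parameters estimated on the original sample; passing refit=True re-estimates them on the combined dataset, at a higher computational cost.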

The code snippet below prints the documentation of the append() method:
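One way to do this is with Python’s built-in help() function, shown here on the ARIMAResults class (calling it on a fitted results object works equally well):

```python
from statsmodels.tsa.arima.model import ARIMAResults

# Print the docstring of the append() method
help(ARIMAResults.append)
```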
