Resampling—Upsample and Downsample
Understand how to perform upsampling and downsampling of time series data in pandas.
Overview
Resampling is a powerful technique used in time series analysis to convert data from one frequency to another. It’s useful when working with time series data that has irregular intervals or when we want to analyze data at a different frequency than the original data.
We can think of resampling as similar to a GroupBy operation, except that we aggregate based on a time frequency. In pandas, resampling is made easy with resample(). The resample() method allows us to change the frequency of our time series data and provides two main types of resampling:
Downsampling: Involves reducing the frequency of the data, such as converting daily data to monthly data. This process usually requires an aggregation function to combine the data points within each new interval (e.g., mean, sum, or count) because we’re compressing the data into fewer data points with a lower resolution.
Upsampling: Involves increasing the frequency of the data, for example, converting monthly data to daily data. This process may require interpolation or forward/backward filling to fill in the missing data points in the new intervals because we’re expanding the data into more data points with a more granular resolution.
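As a minimal sketch of both directions, the example below uses a small synthetic daily series (the values, the series name, and the date range are assumptions for illustration, not the lesson's dataset). It downsamples daily data to monthly means and then upsamples the monthly result back to daily frequency using forward filling and linear interpolation.

```python
import pandas as pd

# Synthetic daily series (illustrative only): values 1..90 for
# 2018-01-01 through 2018-03-31.
daily = pd.Series(
    range(1, 91),
    index=pd.date_range("2018-01-01", periods=90, freq="D"),
    name="demand",
)

# Downsampling: daily -> monthly. Each month's values are combined
# with an aggregation function (here, the mean). "MS" labels each
# bin with the first day of the month.
monthly_mean = daily.resample("MS").mean()

# Upsampling: monthly -> daily. The new daily slots have no data,
# so asfreq() leaves them as NaN.
daily_gaps = monthly_mean.resample("D").asfreq()

# Fill the gaps by carrying the last known value forward...
daily_ffill = monthly_mean.resample("D").ffill()

# ...or by interpolating linearly between the known monthly points.
daily_interp = monthly_mean.resample("D").interpolate()

print(monthly_mean)
print(daily_gaps.head())
print(daily_ffill.head())
print(daily_interp.head())
```

Note that upsampling the monthly means does not recover the original daily values. The filled points are estimates derived from the coarser data.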
The resample() method returns a Resampler object, on which we can call aggregation methods such as mean() and sum().
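To illustrate the Resampler object and the GroupBy analogy, here is a short sketch on synthetic hourly data. The column names mirror the two columns mentioned in this lesson, but the values and the two-day date range are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic hourly data for two days (values are illustrative only).
hours = pd.date_range("2018-01-01", periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame(
    {
        "generation fossil gas": np.arange(48, dtype=float),
        "generation hard coal": np.full(48, 10.0),
    },
    index=hours,
)

# resample() does not compute anything yet; it returns a Resampler
# that groups the rows into daily bins.
resampler = df.resample("D")
print(type(resampler).__name__)

# Aggregation methods are then called on the Resampler.
daily_mean = resampler.mean()
daily_sum = resampler.sum()

# agg() applies a different aggregation to each column.
daily_mixed = resampler.agg(
    {"generation fossil gas": "mean", "generation hard coal": "sum"}
)

# The GroupBy analogy: grouping by a daily time Grouper gives the
# same bins as resample("D").
grouped_mean = df.groupby(pd.Grouper(freq="D")).mean()

print(daily_mean)
print(daily_sum)
print(daily_mixed)
print(grouped_mean)
```

Because the Resampler is lazy, the same object can be reused for several aggregations without regrouping the data by hand.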
In this lesson, we’ll look at the ENTSO-E hourly energy demand data for 2018, which covers electrical consumption and generation for Spain. To keep the example simple, the dataset has been truncated to just two columns: generation fossil gas and generation hard coal.