Create a Pandas Timestamp and date_range Object

Let's find out how to use a pandas Timestamp and date_range.

Try it yourself

Try executing the code below to see the result.

import pandas as pd
start = pd.Timestamp.fromtimestamp(0).strftime('%Y-%m-%d')
times = pd.date_range(start=start, freq='M', periods=2)
print(times)

Explanation

There are a couple of puzzling things here:

  • M is a month frequency.
  • The first date is January 31st and not January 1st.

Let’s start with M being a month frequency. You’ve probably used the infamous strftime or its cousin strptime to convert datetime to or from strings. In those cases, M stands for minute:

In [1]: t = pd.Timestamp(2020, 5, 10, 14, 21, 30)
In [2]: t.strftime('%H:%M')
Out[2]: '14:21'

One of the things most programmers like about pandas is that it’s one of the best-documented open-source packages available. But pandas is a big library, and sometimes it’s hard to find what we’re looking for.

If we look at the pandas.date_range documentation, we’ll see the following:

"freq: str or DateOffset, default ‘D’

Frequency strings can have multiples, e.g., ‘5H’."

The M in date_range stands for month-end frequency whereas the minute frequency is T or min.
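To make the two meanings of M concrete, here is a minimal sketch that puts the strftime directives next to date_range frequencies. It sidesteps the string aliases where it can: the month-end frequency is passed as a pd.offsets.MonthEnd object and minutes are spelled min. That choice is an assumption on our part, made because, as far as we know, recent pandas releases are phasing out the single-letter M and T aliases (in favor of ME and min), so the exact strings you can use may depend on your installed version.

```python
import pandas as pd

t = pd.Timestamp(2020, 5, 10, 14, 21, 30)

# In strftime, %M is the minute field and %m is the month field.
minute_text = t.strftime('%M')  # '21'
month_text = t.strftime('%m')   # '05'

# As a date_range frequency, one minute is spelled 'min'.
minutes = pd.date_range(start=t, periods=3, freq='min')

# Month-end frequency, given as an offset object rather than a string alias.
month_ends = pd.date_range(start=t, periods=3, freq=pd.offsets.MonthEnd())

print(minute_text, month_text)
print(minutes)
print(month_ends)
```

The month-end range lands on the last day of May, June, and July 2020, while the minute range advances one minute at a time from t.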

This solves one puzzle and also gives us a hint as to why we see January 31st and not January 1st. In the puzzle Y3K, we saw that time 0, or epoch time, is counted from January 1, 1970.

In [3]: pd.Timestamp(0)
Out[3]: Timestamp('1970-01-01 00:00:00')
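As a quick check of the epoch claim, the sketch below builds time 0 in two ways and measures a one-second step from it. The variable names are ours, and the unit='s' argument is used only to make the unit of the integer explicit.

```python
import pandas as pd

# Time 0 is the epoch: midnight on January 1, 1970.
epoch = pd.Timestamp(0)
same_epoch = pd.Timestamp('1970-01-01')

# With unit='s', the integer is read as seconds since the epoch.
one_second = pd.Timestamp(1, unit='s')

print(epoch)
print(one_second - epoch)
```

Note that the teaser builds its start date with Timestamp.fromtimestamp, which, like datetime.fromtimestamp, works in local time, so its result may vary with the machine's time zone.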

If we follow the code of pandas.date_range, we’ll see it converts freq from a str to a pandas.DateOffset. Then, date_range uses pandas.DateOffset.apply on start. From there, it adds the offset periods times. Here’s how this is done:

In [4]: from pandas.tseries.frequencies import to_offset
In [5]: start = pd.Timestamp(0)
In [6]: offset = to_offset('M')
In [7]: offset
Out[7]: <MonthEnd>
In [8]: t0 = offset.apply(start)
In [9]: t0
Out[9]: Timestamp('1970-01-31 00:00:00')
In [10]: t0 + offset
Out[10]: Timestamp('1970-02-28 00:00:00')
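In case offset.apply is not available in your pandas version (we believe newer releases dropped it), the same two steps can be sketched with rollforward and plain addition. This mirrors the behavior described above under that assumption; it is not a copy of the internal code path of date_range.

```python
import pandas as pd

start = pd.Timestamp(0)
offset = pd.offsets.MonthEnd()

# Step 1: roll the start forward to the first date that lies on the offset.
t0 = offset.rollforward(start)

# Step 2: each further element is the previous one plus the offset.
t1 = t0 + offset

# date_range with the same offset should produce the same two dates.
times = pd.date_range(start=start, freq=offset, periods=2)

print(t0, t1)
print(times)
```

Here t0 is January 31, 1970, and t1 is February 28, 1970, matching the two entries of times.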

This is what we see in this teaser’s output. Note that frequencies don’t have to be single units; they can be multiples of a unit. The following will give us a date range in five-minute intervals.

In [11]: pd.date_range(start=pd.Timestamp(0), periods=3, freq='5T')
Out[11]:
DatetimeIndex(['1970-01-01 00:00:00', '1970-01-01 00:05:00',
               '1970-01-01 00:10:00'],
              dtype='datetime64[ns]', freq='5T')
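A multiple can also be written without the T alias. The sketch below builds the same five-minute range twice, once with the 5min string and once with a pd.offsets.Minute(5) object, and checks the spacing between entries. We use min here on the assumption that it is accepted by both older and newer pandas releases.

```python
import pandas as pd

start = pd.Timestamp(0)

# Five-minute steps, written as a multiple of the 'min' alias.
by_alias = pd.date_range(start=start, periods=3, freq='5min')

# The same frequency expressed as an offset object.
by_offset = pd.date_range(start=start, periods=3, freq=pd.offsets.Minute(5))

# Gaps between consecutive entries.
steps = by_alias[1:] - by_alias[:-1]

print(by_alias)
print(steps)
```

Both ranges contain the same three timestamps, and every gap is five minutes.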