user7786493

Reputation: 473

Different date types in pandas and Python

I am building a back-test for a trading strategy that stores dates as the index. Can someone explain the differences between the following date types, and whether each is mutable when assigned to a variable?

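import datetime
import pandas as pd

a = pd.date_range('1/1/2016', periods=10, freq='w')
b = datetime.datetime(2016, 1, 4)
c = pd.datetime(2016, 1, 4)  # alias of datetime.datetime; removed in pandas 2.0
d = pd.Timestamp(153543453435)  # an integer is read as nanoseconds since the Unix epoch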

When I print the type of each one, I get the following:

<class 'pandas.core.indexes.datetimes.DatetimeIndex'>  # print(type(a))
<class 'pandas._libs.tslib.Timestamp'>  # print(type(a[0]))
<class 'datetime.datetime'>  # print(type(b))
<class 'datetime.datetime'>  # print(type(c))
<class 'pandas._libs.tslib.Timestamp'>  # print(type(d))

It would be great if someone could explain in detail how these types differ and how mutability works when they are assigned to variables.

Upvotes: 0

Views: 109

Answers (1)

Alex

Reputation: 19104

dti = pd.date_range('1/1/2016',periods=10,freq='w')

According to the docs, DatetimeIndex is:

Immutable ndarray of datetime64 data, represented internally as int64, and which can be boxed to Timestamp objects that are subclasses of datetime and carry metadata such as frequency information.
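
To make that description concrete, here is a small sketch (the variable names are mine, and I use the explicit 'W-SUN' alias rather than 'w') that checks the three claims in the quote: the values are datetime64 backed by int64, elements are boxed to Timestamp (a datetime subclass), and item assignment is rejected.

```python
import datetime
import pandas as pd

dti = pd.date_range("2016-01-01", periods=10, freq="W-SUN")

# The index holds numpy datetime64 values, which can be viewed as int64.
is_datetime64 = dti.dtype.kind == "M"
raw_ints = dti.to_numpy().view("int64")

# Pulling out one element boxes it into a pandas Timestamp,
# which is also an instance of datetime.datetime.
first = dti[0]
is_timestamp = isinstance(first, pd.Timestamp)
is_datetime = isinstance(first, datetime.datetime)

# Item assignment is not supported on an Index.
try:
    dti[0] = pd.Timestamp("2015-01-01")
    assignment_failed = False
except TypeError:
    assignment_failed = True
```

If you need different values, build a new index (for example with arithmetic or another date_range call) instead of editing one in place.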

ts = dti[0]

Furthermore, the pandas Timestamp object is designed to be immutable:

ts  # returns Timestamp('2016-01-03 00:00:00', freq='W-SUN')
ts.replace(year=2015)  # returns Timestamp('2015-01-03 00:00:00', freq='W-SUN')
ts  # returns Timestamp('2016-01-03 00:00:00', freq='W-SUN')

Note how the year of the original Timestamp object did not change. Instead the replace method returned a new Timestamp object.
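
As for what happens on assignment, here is a minimal sketch (names are illustrative). Plain assignment only binds a second name to the same object; since the object cannot change, the only way to get a "modified" value is to rebind a name to a new Timestamp.

```python
import pandas as pd

ts = pd.Timestamp("2016-01-03")

alias = ts
same_object = alias is ts  # assignment copies the reference, not the value

# replace() returns a new Timestamp; this rebinds alias and leaves ts alone.
alias = alias.replace(year=2015)

# Arithmetic also returns a new object.
next_day = ts + pd.Timedelta(days=1)
```

So for a back-test you can pass Timestamps around freely without worrying that some other piece of code will change them underneath you.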

Lastly, with respect to native Python datetime objects, according to the Python docs:

Objects of these types are immutable.
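
A quick sketch of what that means in practice for datetime.datetime (variable names are mine):

```python
import datetime

dt = datetime.datetime(2016, 1, 4)
alias = dt  # a second name for the same object

# replace() builds a new datetime rather than editing dt.
changed = dt.replace(year=2015)

# Attributes are read-only, so direct assignment fails.
try:
    dt.year = 2015
    mutated = True
except AttributeError:
    mutated = False
```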

Here is a good SO post about converting between different types representing datetimes.
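
For the common cases, the conversions can be sketched roughly like this. It is only an illustration of a few round trips between the three representations, not a full summary of that post.

```python
import datetime
import numpy as np
import pandas as pd

py_dt = datetime.datetime(2016, 1, 4)

# datetime.datetime -> pandas Timestamp
ts = pd.Timestamp(py_dt)

# pandas Timestamp -> numpy datetime64, and back again
np_dt = ts.to_datetime64()
from_np = pd.Timestamp(np_dt)

# pandas Timestamp -> plain datetime.datetime
back = ts.to_pydatetime()
```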

So why would you use one as opposed to another?

Native datetimes can be a pain to work with, which is why pandas provides its own datetime subclass, Timestamp. These objects carry extra metadata that makes them easier to manipulate. A DatetimeIndex is essentially an array of numpy datetime64 values that are boxed into Timestamp objects when you access them, which gives you the added functionality. For example, using Timestamp/DatetimeIndex you can:

  • Add a certain number of business days to a DatetimeIndex.
  • Create sequences that span a certain number of weeks.
  • Change timezones.
  • etc.
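
Here is a rough sketch of the three operations listed above. The dates and the "America/New_York" zone are just example values I picked, and the timezone step assumes timezone data is available on your system.

```python
import pandas as pd

dti = pd.date_range("2016-01-04", periods=3, freq="D")  # Mon, Tue, Wed

# Add five business days to every entry (weekends are skipped).
plus_bdays = dti + pd.offsets.BDay(5)

# Build a sequence of four consecutive Mondays.
mondays = pd.date_range("2016-01-04", periods=4, freq="W-MON")

# Attach a timezone, then convert to another one.
utc = dti.tz_localize("UTC")
eastern = utc.tz_convert("America/New_York")
```

Each call returns a new index, so the original dti is left untouched.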

All of these things would be a royal pain without the extra methods and metadata stored on the Timestamp and DatetimeIndex classes.

Take a look at the pandas docs for more examples.

Upvotes: 3
