Reputation: 473
I am building a back-test for a trading strategy that stores dates as the index. Can someone explain the differences between the following date types, and whether each one is mutable when assigned to a variable?
a=pd.date_range('1/1/2016',periods=10,freq='w')
b=datetime.datetime(2016,1,4)
c=pd.datetime(2016,1,4)
d=pd.Timestamp(153543453435)
When I print their types, I get the following:
<class 'pandas.core.indexes.datetimes.DatetimeIndex'> (from print(type(a)))
<class 'pandas._libs.tslib.Timestamp'> (from print(type(a[0])))
<class 'datetime.datetime'>
<class 'datetime.datetime'>
<class 'pandas._libs.tslib.Timestamp'>
It would be great if someone could explain in detail the differences between them and how mutability works when assigning them to variables.
Upvotes: 0
Views: 109
Reputation: 19104
dti = pd.date_range('1/1/2016',periods=10,freq='w')
According to the docs, DatetimeIndex is:
Immutable ndarray of datetime64 data, represented internally as int64, and which can be boxed to Timestamp objects that are subclasses of datetime and carry metadata such as frequency information.
ts = dti[0]
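The quoted description can be checked directly. The following is a minimal sketch, assuming a reasonably recent pandas; the variable names are only illustrative, and a daily frequency is used instead of the weekly one to keep the example small.

```python
import datetime

import pandas as pd

dti = pd.date_range("2016-01-01", periods=3, freq="D")
ts = dti[0]  # indexing "boxes" the stored value into a Timestamp

# A Timestamp is also a datetime.datetime, because it subclasses it.
is_timestamp = isinstance(ts, pd.Timestamp)
is_datetime = isinstance(ts, datetime.datetime)

# The underlying storage is a plain integer array.
raw_dtype = dti.asi8.dtype

print(is_timestamp, is_datetime, raw_dtype)
```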
Furthermore, the pandas Timestamp object is designed to be immutable:
ts # returns Timestamp('2016-01-03 00:00:00', freq='W-SUN')
ts.replace(year=2015) # returns Timestamp('2015-01-03 00:00:00', freq='W-SUN')
ts # returns Timestamp('2016-01-03 00:00:00', freq='W-SUN')
Note how the year of the original Timestamp object did not change. Instead, the replace method returned a new Timestamp object.
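The same holds for the index as a whole. As a small sketch (again with illustrative names and a daily frequency), item assignment on a DatetimeIndex is rejected, and arithmetic returns a new index rather than changing the existing one:

```python
import pandas as pd

dti = pd.date_range("2016-01-01", periods=3, freq="D")

# Item assignment on an Index raises TypeError.
try:
    dti[0] = pd.Timestamp("2015-01-01")
    index_assignment_failed = False
except TypeError:
    index_assignment_failed = True

# Arithmetic builds a new DatetimeIndex; dti itself is untouched.
shifted = dti + pd.Timedelta(days=1)

# replace() on an element also returns a new Timestamp.
ts = dti[0]
ts_2015 = ts.replace(year=2015)

print(index_assignment_failed, dti[0], shifted[0], ts_2015)
```

So to "modify" a date index in a back-test, you build a new index and assign it back to the name (or to the DataFrame's index attribute).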
Lastly, with respect to native Python datetime objects, according to the Python docs:
Objects of these types are immutable.
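In terms of variable assignment, immutability means that assigning one name to another never copies or alters the object; both names point to the same value until one of them is rebound. A minimal sketch using only the standard library (the name alias is just for illustration):

```python
import datetime

b = datetime.datetime(2016, 1, 4)
alias = b  # both names now refer to the same object

# Attributes of a datetime cannot be written in place.
try:
    b.year = 2015
    attribute_write_failed = False
except AttributeError:
    attribute_write_failed = True

# "Changing" b actually rebinds the name to a new object;
# alias still refers to the original value.
b = b.replace(year=2015)

print(attribute_write_failed, alias, b)
```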
Here is a good SO post about converting between different types representing datetimes.
So why would you use one as opposed to another?
datetimes can be a pain to work with. That's why pandas created its own wrapper class (Timestamp). Metadata stored on these objects makes them easier to manipulate. The DatetimeIndex is just a sequence of numpy datetime64 objects that are boxed into Timestamp objects for the added functionality. For example, using Timestamp/DatetimeIndex you can:
All of these things would be a royal pain without the extra methods and metadata stored on the Timestamp and DatetimeIndex classes.
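As a rough illustration of the kind of convenience meant here (these are examples chosen for this sketch, not necessarily the ones the answer had in mind, and the prices series is made-up data), a few operations that are short with Timestamp and DatetimeIndex:

```python
import pandas as pd

ts = pd.Timestamp("2016-01-04")
day = ts.day_name()                      # name of the weekday
next_week = ts + pd.Timedelta(weeks=1)   # date arithmetic

# Hypothetical daily data indexed by date.
dti = pd.date_range("2016-01-01", periods=10, freq="D")
prices = pd.Series(range(10), index=dti)

# Select a date range using plain strings (both ends inclusive).
first_three = prices.loc["2016-01-01":"2016-01-03"]

# Aggregate daily values into weekly bins (weeks ending on Sunday).
weekly = prices.resample("W").sum()

print(day, next_week, len(first_three))
print(weekly)
```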
Take a look at the pandas docs for more examples.
Upvotes: 3