Dror

Reputation: 13051

Confusing datetime objects in pandas

I am confused by the way pandas handles time-related objects.

If I do

import pandas as pd

x = pd.datetime.fromtimestamp(1440502703064/1000.) # or
x = pd.datetime(1234,5,6)

then type(x) returns datetime.datetime in either case. However, if I have:

z = pd.DataFrame([
    {'a': 'foo', 'ts': pd.datetime.fromtimestamp(1440502703064/1000.)}
    ])

then type(z['ts'][0]) returns pandas.tslib.Timestamp. When does this cast happen? Is it triggered by pandas or perhaps by numpy? What is the type I get in the latter case, and where is it documented?

Upvotes: 2

Views: 104

Answers (1)

user707650

I'm not 100% sure, since I haven't studied the underlying code, but the conversion from datetime.datetime happens the moment the value is "incorporated" into a DataFrame.

Outside a DataFrame, pd.datetime (and pd.datetime.fromtimestamp) returns a plain Python datetime.datetime object.

Inside, it uses something it can probably work better with internally. You can see the conversion occurring when creating a DataFrame by using a datetime.datetime object instead:

>>> import pandas as pd
>>> from datetime import datetime
>>> z = pd.DataFrame([
...     {'a': 'foo', 'ts': datetime(2015,8,27)}])
>>> type(z['ts'][0])
pandas.tslib.Timestamp

Perhaps even clearer:

>>> pd.datetime == datetime
True

So the conversion happens during the DataFrame initialisation.
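To make the point above concrete, here is a minimal sketch that checks the column dtype and the element type after construction. It assumes a reasonably recent pandas and uses only the standard library's datetime, since pd.datetime is not available in every pandas version. The variable names raw, col_dtype and elem are just illustrative.

```python
from datetime import datetime

import pandas as pd

# A plain Python datetime, created outside any pandas container.
raw = datetime(2015, 8, 27)

# Building the DataFrame is the step where the value gets converted.
z = pd.DataFrame([{'a': 'foo', 'ts': raw}])

col_dtype = z['ts'].dtype   # a datetime64 dtype, not object
elem = z['ts'][0]           # a pandas Timestamp, not the original object

print(type(raw))
print(col_dtype)
print(type(elem))
print(elem == raw)          # the value itself is unchanged
```

The column ends up with a datetime64 dtype, which suggests pandas stores the values in a NumPy-backed array and hands back a Timestamp when you pull out a single element. The exact module path printed for the Timestamp class depends on the pandas version.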

As for documentation, I searched around and found the source (note: probably not a very time-safe link), which says (doc-string):

TimeStamp is the pandas equivalent of python's Datetime and is interchangable with it in most cases. It's the type used for the entries that make up a DatetimeIndex, and other timeseries oriented data structures in pandas.
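As a rough illustration of "interchangeable in most cases", the sketch below checks that a Timestamp passes an isinstance test against datetime.datetime and can be converted back with to_pydatetime(). This is my own example, not taken from the pandas source, and the variable names are arbitrary.

```python
from datetime import datetime

import pandas as pd

original = datetime(2015, 8, 27, 12, 30)
ts = pd.Timestamp(original)

# Timestamp subclasses datetime.datetime, so code expecting a datetime
# will usually accept it.
is_datetime = isinstance(ts, datetime)

# Convert back to a plain Python datetime when the exact type matters.
back = ts.to_pydatetime()

print(is_datetime)
print(type(back))
print(back == original)
print(ts.year, ts.month, ts.day)   # the usual datetime attributes work
```

If some library insists on the exact datetime.datetime type, calling to_pydatetime() on the element before handing it over is probably the simplest workaround.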

Upvotes: 1
