Xiphias

Reputation: 4716

Why does pandas return timestamps instead of datetime objects when calling pd.to_datetime()?

According to the manual, pd.to_datetime() should create a datetime object.

Instead, when I call pd.to_datetime("2012-05-14"), I get a timestamp object! Calling to_datetime() on that object finally gives me a datetime object.

In [1]: pd.to_datetime("2012-05-14")
Out[1]: Timestamp('2012-05-14 00:00:00', tz=None)

In [2]: t = pd.to_datetime("2012-05-14")
In [3]: t.to_datetime()
Out[3]: datetime.datetime(2012, 5, 14, 0, 0)

Is there an explanation for this unexpected behaviour?

Upvotes: 20

Views: 13704

Answers (1)

joris

Reputation: 139162

A Timestamp is how pandas represents a single datetime, so in pandas terms it is a datetime object. What you expected was a datetime.datetime object.
Normally you should not need to care about the difference (for most purposes it is just a different repr). As long as you are working within pandas, the Timestamp is fine. Even if you really need a datetime.datetime, most things will work unchanged (e.g. all of its methods are available), and otherwise you can call to_pydatetime() to get a datetime.datetime object.
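To make this concrete, here is a minimal sketch, assuming a reasonably recent pandas. It only uses the call from the question plus to_pydatetime(); the variable names are just for illustration. As far as I know, the Timestamp.to_datetime() method used in the question has since been removed from pandas, so to_pydatetime() is the one to rely on.

```python
import datetime as dt

import pandas as pd

# pd.to_datetime on a single string returns a pandas Timestamp
ts = pd.to_datetime("2012-05-14")

# Timestamp is a subclass of datetime.datetime, so this check passes
is_datetime = isinstance(ts, dt.datetime)

# The usual datetime attributes and methods are available directly
year = ts.year
iso = ts.isoformat()

# Explicit conversion to a plain datetime.datetime, if one is really needed
py_dt = ts.to_pydatetime()

print(type(ts).__name__, is_datetime, year, iso)
print(type(py_dt) is dt.datetime)
```

So code that only needs "something that behaves like a datetime.datetime" can usually take the Timestamp as is; the explicit conversion only matters when an exact type check is involved.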

The longer story:

  • pandas stores datetimes in an index or column as data of type datetime64 (these are not datetime.datetime objects). This is the standard numpy type for datetimes and is more performant than storing datetime.datetime objects:

     In [14]: import datetime as dt
    
     In [15]: df = pd.DataFrame({'A':[dt.datetime(2012,1,1), dt.datetime(2012,1,2)]})
    
     In [16]: df.dtypes
     Out[16]:
     A    datetime64[ns]
     dtype: object
    
     In [17]: df.loc[0,'A']
     Out[17]: Timestamp('2012-01-01 00:00:00', tz=None)
    
  • when you retrieve a single value from such a datetime column or index, you get a Timestamp object. It is more convenient to work with than a raw datetime64 (more methods, a better representation, etc.), and it is a subclass of datetime.datetime, so it has all of that class's methods.
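The two points above can be checked with a short script. This is only a sketch under the assumption of a recent pandas and numpy; the exact datetime64 unit (ns, us, ...) may differ between versions, so the check below only looks at the kind of dtype, not its resolution.

```python
import datetime as dt

import pandas as pd

df = pd.DataFrame({"A": [dt.datetime(2012, 1, 1), dt.datetime(2012, 1, 2)]})

# The column is stored as datetime64, not as Python datetime objects
stored_as_datetime64 = pd.api.types.is_datetime64_any_dtype(df["A"])

# The underlying numpy array has a datetime64 dtype (kind "M")
raw = df["A"].to_numpy()
raw_kind = raw.dtype.kind

# A single value comes back boxed as a Timestamp ...
value = df.loc[0, "A"]
is_timestamp = isinstance(value, pd.Timestamp)

# ... which is also a datetime.datetime, with extra methods on top
is_datetime = isinstance(value, dt.datetime)
weekday = value.day_name()

print(df["A"].dtype, raw_kind, is_timestamp, is_datetime, weekday)
```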

Upvotes: 25
