Reputation: 16385
I'm trying to create a new Pandas dataframe column with ordinal day from a datetime column:
import pandas as pd
from datetime import datetime
print(df.iloc[0:5])
date
file
gom3_197801.nc 2011-02-16 00:00:00
gom3_197802.nc 2011-02-16 00:00:00
gom3_197803.nc 2011-02-15 00:00:00
gom3_197804.nc 2011-02-17 00:00:00
gom3_197805.nc 2011-11-14 00:00:00
df['date'][0].toordinal()
Out[6]:
734184
df['date'].toordinal()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-7-dbfd5e8b60f0> in <module>()
----> 1 df['date'].toordinal()
AttributeError: 'Series' object has no attribute 'toordinal'
I guess this is a basic question, but I've struggled with the docs for the last 30 minutes.
How can I create an ordinal time column for my dataframe?
Upvotes: 17
Views: 22910
Reputation: 403
I hate to have to resort to apply or map, so here's a more efficient approach (about 2x faster in my case). It uses np.vectorize.
import pandas as pd
import numpy as np
def to_ordinal(dt):
    return dt.toordinal()
vectorized_ordinal = np.vectorize(to_ordinal, otypes=['int'])
df = pd.DataFrame()
df['date'] = pd.date_range('2000-01-01', '2030-01-01', freq='D')
# Pass the datetime column itself to the vectorized function
df['ordinal_date'] = vectorized_ordinal(df['date'])
Using np.vectorize
%timeit vectorized_ordinal(df['date'])
5.89 ms ± 447 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Using apply
%timeit df['date'].apply(pd.Timestamp.toordinal)
11.2 ms ± 429 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Using map
%timeit df['date'].map(pd.Timestamp.toordinal)
32.5 ms ± 1.74 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Upvotes: 1
Reputation: 23773
For completeness:
Apply pd.Timestamp.toordinal
df['date'].apply(pd.Timestamp.toordinal)
Upvotes: 3
Reputation: 77991
You can also use map:
import datetime as dt
df['date'].map(dt.datetime.toordinal)
Upvotes: 12
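To see the approaches from the answers side by side, here is a minimal, self-contained sketch. It rebuilds a small frame from the dates printed in the question (the original df is not available, so the construction is an assumption), applies map and apply, and stores the result as a new column. The last variant, which subtracts a reference date and adds that date's ordinal, is not from any of the answers; it is included only as one possible way to avoid a Python call per row.

```python
import datetime as dt

import pandas as pd

# Sample dates copied from the question's printed output (frame construction is assumed).
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2011-02-16", "2011-02-16", "2011-02-15", "2011-02-17", "2011-11-14"]
        )
    }
)

# Element-wise approaches from the answers: pd.Timestamp subclasses
# datetime.datetime, so either unbound method works on each value.
via_map = df["date"].map(dt.datetime.toordinal)
via_apply = df["date"].apply(pd.Timestamp.toordinal)

# Arithmetic alternative (not from the answers): whole days elapsed since a
# reference date, plus the ordinal of that reference date.
reference = pd.Timestamp("1970-01-01")
via_arithmetic = (df["date"] - reference).dt.days + reference.toordinal()

# Store the result as the new column the question asks for.
df["ordinal_date"] = via_map
print(df)
```

For the first row this should agree with the 734184 shown in the question for df['date'][0].toordinal(). Which variant is fastest will depend on the pandas version and the size of the data, so it is worth timing on your own frame.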