Ardor Orenda
Ardor Orenda

Reputation: 51

Integer to date in Python: unsupported type for timedelta days component: Series

I want to create a new column DATE in the dataframe transaction from the column DAY (lists days from beginning of the study). Like, if DAY is 1, I want DATE to show 2014-1-1, or if DAY is 12 it should be 2014-1-12. This is what I am doing to convert days into date and it works:

import datetime
def serial_date_to_string(srl_no):
    new_date = datetime.datetime(2014,1,1,0,0) + datetime.timedelta(srl_no - 1)
    return new_date.strftime("%Y-%m-%d")

However, when I try to use this formula to add a new column, it doesn't work:

transaction['DATE'] = serial_date_to_string(transaction['DAY'])

TypeError: unsupported type for timedelta days component: Series

But the DAY column type is int64. I tried to search on forums and found that the formula could be adjusted, if I try to use this:

def serial_date_to_string(srl_no):
    new_date = datetime.datetime(2014,1,1,0,0) + (srl_no - 1).map(datetime.timedelta)
    return new_date.strftime("%Y-%m-%d")

It still gives AttributeError: 'Series' object has no attribute 'strftime'.

enter image description here

Thank you for any help!

Upvotes: 1

Views: 2104

Answers (2)

jezrael
jezrael

Reputation: 863056

Use to_datetime with parameters origin for starting day and unit=d for days:

df = pd.DataFrame({'DAY':[1,12,20]})

df['date'] = pd.to_datetime(df['DAY'] - 1, origin='2014-01-01', unit='d')
print (df)
   DAY       date
0    1 2014-01-01
1   12 2014-01-12
2   20 2014-01-20

For same ouput add Series.dt.strftime:

df['date'] = pd.to_datetime(df['DAY'] - 1, origin='2014-01-01', unit='d').dt.strftime("%Y-%m-%d")
print (df)
   DAY        date
0    1  2014-01-01
1   12  2014-01-12
2   20  2014-01-20

EDIT:

For your function is possible use Series.apply:

transaction['DATE'] = transaction['DAY'].apply(serial_date_to_string)

Performance is similar, apply here is fastest for 10k rows:

import datetime
def serial_date_to_string(srl_no):
    new_date = datetime.datetime(2014,1,1,0,0) + datetime.timedelta(srl_no - 1)
    return new_date.strftime("%Y-%m-%d")
    

np.random.seed(2021)
df = pd.DataFrame({'DAY': np.random.randint(10, 1000, size=10000)})

In [17]: %timeit pd.to_datetime(df['DAY'] - 1, origin='2014-01-01', unit='d').dt.strftime("%Y-%m-%d")
79.4 ms ± 1.18 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [18]: %timeit df['DAY'].apply(serial_date_to_string)
57.1 ms ± 110 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [19]: %timeit df['DAY'].map(serial_date_to_string)
64.7 ms ± 5.83 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Upvotes: 1

piRSquared
piRSquared

Reputation: 294468

Use map

transaction['DATE'] = transaction['DAY'].map(serial_date_to_string)

Upvotes: 2

Related Questions