Reputation: 1334
I would like to convert a dataframe of timedeltas into hours. I can do this for one series (one column of the dataframe) but I would like to find a way to apply it to all columns.
A for loop works, but is there a faster or more Pythonic way to do this?
import pandas as pd
import datetime
import numpy as np
df = pd.DataFrame({'a': pd.to_timedelta(['0 days 00:00:08','0 days 05:05:00', '0 days 01:01:57']),
'b' : pd.to_timedelta(['0 days 00:44:00','0 days 00:15:00','0 days 01:02:00']),
'c': pd.to_timedelta(['0 days 00:34:33','0 days 04:04:00','0 days 01:31:58'])})
df
a b c
0 00:00:08 00:44:00 00:34:33
1 05:05:00 00:15:00 04:04:00
2 01:01:57 01:02:00 01:31:58
for c in df.columns:
    df[c] = (df[c]/np.timedelta64(1,'h')).astype(float)
df
a b c
0 0.002222 0.733333 0.575833
1 5.083333 0.250000 4.066667
2 1.032500 1.033333 1.532778
I've tried using lambda, but there's something I'm getting wrong:
df = df.apply(lambda x: x/np.timedelta(1, 'h')).astype(float)
Returns the error:
AttributeError: ("'module' object has no attribute 'timedelta'", u'occurred at index a')
Upvotes: 1
Views: 2565
Reputation: 862611
Use np.timedelta64; NumPy has no np.timedelta, which is what raises the AttributeError.
To work with all columns at once, divide the underlying 2D NumPy array:
df = pd.DataFrame(df.values / np.timedelta64(1, 'h'), columns=df.columns, index=df.index)
print (df)
a b c
0 0.002222 0.733333 0.575833
1 5.083333 0.250000 4.066667
2 1.032500 1.033333 1.532778
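As a self-contained check of the array approach, here is a minimal sketch on a smaller two-column frame. The variable name `hours` is only illustrative and is not from the original post:

```python
import numpy as np
import pandas as pd

# Smaller version of the frame from the question (columns 'a' and 'b' only).
df = pd.DataFrame({
    'a': pd.to_timedelta(['0 days 00:00:08', '0 days 05:05:00', '0 days 01:01:57']),
    'b': pd.to_timedelta(['0 days 00:44:00', '0 days 00:15:00', '0 days 01:02:00']),
})

# df.values is a 2-D timedelta64 array; dividing it by a one-hour
# timedelta64 yields a plain float array holding the number of hours.
hours = pd.DataFrame(df.values / np.timedelta64(1, 'h'),
                     columns=df.columns, index=df.index)

print(hours.dtypes)
```

Because the division runs on the whole array in one step, no Python-level loop over columns is needed; the labels are restored by passing `columns` and `index` back in.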
If you want to use apply:
df = df.apply(lambda x: x/np.timedelta64(1, 'h'))
print (df)
a b c
0 0.002222 0.733333 0.575833
1 5.083333 0.250000 4.066667
2 1.032500 1.033333 1.532778
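Since the lambda only divides each column by a scalar, the same division can probably be applied to the whole DataFrame without apply. This is a sketch under the assumption of a reasonably recent pandas version, and the use of `pd.Timedelta(hours=1)` in place of `np.timedelta64(1, 'h')` is my substitution rather than part of the answer:

```python
import pandas as pd

df = pd.DataFrame({
    'a': pd.to_timedelta(['0 days 00:00:08', '0 days 05:05:00', '0 days 01:01:57']),
    'b': pd.to_timedelta(['0 days 00:44:00', '0 days 00:15:00', '0 days 01:02:00']),
})

# Assumption: DataFrame arithmetic with a scalar Timedelta is applied
# to every timedelta column and returns floats (hours here).
hours = df / pd.Timedelta(hours=1)

print(hours.dtypes)
```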
Or use total_seconds:
df = df.apply(lambda x: x.dt.total_seconds() / 3600)
print (df)
a b c
0 0.002222 0.733333 0.575833
1 5.083333 0.250000 4.066667
2 1.032500 1.033333 1.532778
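To confirm that the three variants agree, a small sketch can compute each from the same input and compare the results. The names `via_array`, `via_apply`, `via_seconds` and `same` are hypothetical and chosen only for this illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': pd.to_timedelta(['0 days 00:00:08', '0 days 05:05:00', '0 days 01:01:57']),
    'b': pd.to_timedelta(['0 days 00:44:00', '0 days 00:15:00', '0 days 01:02:00']),
    'c': pd.to_timedelta(['0 days 00:34:33', '0 days 04:04:00', '0 days 01:31:58']),
})

one_hour = np.timedelta64(1, 'h')

# 1) Divide the underlying 2-D array, then rebuild the DataFrame.
via_array = pd.DataFrame(df.values / one_hour, columns=df.columns, index=df.index)
# 2) Divide column by column with apply.
via_apply = df.apply(lambda col: col / one_hour)
# 3) Convert each column to seconds, then to hours.
via_seconds = df.apply(lambda col: col.dt.total_seconds() / 3600)

# All three should hold the same hour values up to floating-point error.
same = (np.allclose(via_array.to_numpy(), via_apply.to_numpy())
        and np.allclose(via_array.to_numpy(), via_seconds.to_numpy()))
print(same)
```

Note that each variant here writes to a new name instead of reassigning `df`, so the original timedelta columns stay available for the next conversion.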
Upvotes: 2