Reputation: 970
I have several columns in a pd.DataFrame in which decimal separates hours and minutes (e.g., 3.15 = 3 hours, 15 minutes). Is there a quick way to convert this so that the data are recognized as h.m ? The pandas Time Series documentation doesn't seem to apply to my case. I don't have or want to attach any dates.
I tried:
# create df
hour_min = pd.DataFrame({'a': [4.5, 2.3, 3.17],
'b': [2.12, 1.13, 9.13],
'c': [8.23, 9.14, 7.45]})
# convert to hours
hour_min.astype('timedelta64[h]')
which gives
a b c
0 04:00:00 02:00:00 08:00:00
1 02:00:00 01:00:00 09:00:00
2 03:00:00 09:00:00 07:00:00
but I want
a b c
0 04:50 02:12 08:23
1 02:30 01:13 09:14
2 03:17 09:13 07:45
I also need the following type of result from adding/subtracting column values 1.32 + 1.32 = 3.04
Upvotes: 2
Views: 5086
Reputation: 1922
You're going to want to use pd.to_timedelta
in a function and applymap
it to get the math you want. Looks something like this:
import pandas as pd
import math
def to_t_delt(number):
return pd.to_timedelta(f'{math.floor(number)}hours {(number - math.floor(number)) * 100}min')
hour_min = pd.DataFrame({'a': [4.5, 2.3, 3.17],
'b': [2.12, 1.13, 9.13],
'c': [8.23, 9.14, 7.45]})
hour_min = hour_min.applymap(to_t_delt)
print(hour_min)
print()
print(hour_min['a'] + hour_min['b'])
and yields this result:
a b c
0 04:50:00 02:12:00 08:23:00
1 02:30:00 01:13:00 09:14:00
2 03:17:00 09:13:00 07:45:00
0 07:02:00
1 03:43:00
2 12:30:00
dtype: timedelta64[ns]
Upvotes: 2
Reputation: 6518
I'm pretty sure there should be a more efficient solution, but since no one answered yet, here is a hacky workaround:
import pandas as pd
hour_min = pd.DataFrame({'a': [4.5, 2.3, 3.17],
'b': [2.12, 1.13, 9.13],
'c': [8.23, 9.14, 7.45]})
def convert(number):
hour = ('%.2f' % number).split(sep='.')[0]
minute = ('%.2f' % number).split(sep='.')[1]
return pd.to_datetime(hour+":"+minute, format='%H:%M')
Then you just need to use applymap()
:
>>> hour_min = hour_min.applymap(convert)
>>> hour_min
a b c
0 1900-01-01 04:50:00 1900-01-01 02:12:00 1900-01-01 08:23:00
1 1900-01-01 02:30:00 1900-01-01 01:13:00 1900-01-01 09:14:00
2 1900-01-01 03:17:00 1900-01-01 09:13:00 1900-01-01 07:45:00
You can show the time only with:
>>> for i in hour_min:
hour_min[i] = hour_min[i].dt.time
>>> hour_min
a b c
0 04:50:00 02:12:00 08:23:00
1 02:30:00 01:13:00 09:14:00
2 03:17:00 09:13:00 07:45:00
Upvotes: 5