My series s looks something like this:
0 0 days 09:14:29.142000
1 0 days 00:01:08.060000
2 1 days 00:08:40.192000
3 0 days 17:52:18.782000
4 0 days 01:56:44.696000
dtype: timedelta64[ns]
I'm having trouble understanding how to pull out the hours (rounded to the nearest hour).
Edit:
I realize I can do something like s[0].hours, which gives me 9L. So I can do s[0].hours + 24*s[0].days and then round accordingly using the minutes.
How can I do this on the entire series at once?
Upvotes: 0
This is right out of the docs here. And this is vectorized.
In [16]: s
Out[16]:
0 0 days 09:14:29.142000
1 0 days 00:01:08.060000
2 1 days 00:08:40.192000
3 0 days 17:52:18.782000
4 0 days 01:56:44.696000
Name: 0, dtype: timedelta64[ns]
In [17]: s.dt.components
Out[17]:
   days  hours  minutes  seconds  milliseconds  microseconds  nanoseconds
0     0      9       14       29           142             0            0
1     0      0        1        8            60             0            0
2     1      0        8       40           192             0            0
3     0     17       52       18           782             0            0
4     0      1       56       44           696             0            0
In [18]: s.dt.components.hours
Out[18]:
0 9
1 0
2 0
3 17
4 1
Name: hours, dtype: int64
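Note that components.hours is only the hour-of-day part: row 2 above shows 0 even though the delta is just over 24 hours. A minimal sketch of folding the days back in, along the lines of the question's edit, might look like this. The sample data is copied from the question; the names total_hours and rounded_hours are just illustrative, and the 30-minute cutoff is an assumption about how "round using the minutes" should behave.

```python
import pandas as pd

# Sample data copied from the question.
s = pd.Series(pd.to_timedelta([
    "0 days 09:14:29.142000",
    "0 days 00:01:08.060000",
    "1 days 00:08:40.192000",
    "0 days 17:52:18.782000",
    "0 days 01:56:44.696000",
]))

c = s.dt.components

# Whole hours including full days (truncated, not rounded).
total_hours = c.days * 24 + c.hours

# Assumed rounding rule: bump up by one hour when minutes >= 30.
rounded_hours = total_hours + (c.minutes >= 30).astype(int)

print(total_hours.tolist())
print(rounded_hours.tolist())
```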
Here's another way to approach this if you don't need the actual hours attribute but want the Timedelta expressed in another unit (this is called frequency conversion):
In [31]: s/pd.Timedelta('1h')
Out[31]:
0 9.241428
1 0.018906
2 24.144498
3 17.871884
4 1.945749
dtype: float64
In [32]: np.ceil(s/pd.Timedelta('1h'))
Out[32]:
0 10
1 1
2 25
3 18
4 2
dtype: float64
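np.ceil always rounds up, so it does not give the nearest hour the question asked for. A small sketch of the same frequency conversion with ordinary rounding could look like this. The sample data is copied from the question, and the variable name hours is only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
s = pd.Series(pd.to_timedelta([
    "0 days 09:14:29.142000",
    "0 days 00:01:08.060000",
    "1 days 00:08:40.192000",
    "0 days 17:52:18.782000",
    "0 days 01:56:44.696000",
]))

# Convert to fractional hours, round to the nearest whole hour, cast to int.
hours = (s / pd.Timedelta("1h")).round().astype(int)

print(hours.tolist())  # [9, 0, 24, 18, 2]
```

Series.round follows NumPy's rounding, so an exact half hour would round to the nearest even hour; none of the sample values hit that case.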
Upvotes: 3
Let's assume your time delta column there is called "Delta". Then you can do it this way:
df['rh'] = df.Delta.apply(lambda x: round(pd.Timedelta(x).total_seconds() \
% 86400.0 / 3600.0) )
Each time delta is really a numpy.timedelta64 under the covers. It helps to cast it to a pandas Timedelta, which has more convenient methods. Here I just ask for the number of total seconds, lop off any multiples of 86400 (i.e. numbers that indicate full days), and divide by 3600 (number of seconds in an hour). That gives you a floating point number of hours, which you then round.
I assumed, btw, that you wanted just the hour, minutes, seconds, and partial seconds components considered in the rounded hours, but not the full days. If you want all the hours, including the days, just omit the modulo operation that lops off days:
df['rh2'] = df.Delta.apply(lambda x: round(pd.Timedelta(x).total_seconds() \
/ 3600.0) )
Then you get:
It's also possible to do these calculations directly in numpy terms:
df['rh'] = df.Delta.apply(lambda x: round(x / np.timedelta64(1, 'h')) % 24 )
df['rh2'] = df.Delta.apply(lambda x: round(x / np.timedelta64(1, 'h')) )
Here np.timedelta64(1, 'h') is a one-hour timedelta, so dividing by it gives the duration as a floating-point number of hours, and the optional % 24 lops off whole-day components (if desired).
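The same arithmetic can also be done without apply, using the .dt accessor on the whole column. This is only a sketch: the DataFrame below is built from the question's sample data, and the column name "Delta" is the assumption made earlier in this answer.

```python
import pandas as pd

# Sample data from the question, placed in an assumed "Delta" column.
df = pd.DataFrame({"Delta": pd.to_timedelta([
    "0 days 09:14:29.142000",
    "0 days 00:01:08.060000",
    "1 days 00:08:40.192000",
    "0 days 17:52:18.782000",
    "0 days 01:56:44.696000",
])})

# Total length of each delta in seconds, computed for the whole column.
total_seconds = df["Delta"].dt.total_seconds()

# Rounded hours with full days lopped off.
df["rh"] = ((total_seconds % 86400.0) / 3600.0).round().astype(int)

# Rounded hours including full days.
df["rh2"] = (total_seconds / 3600.0).round().astype(int)

print(df)
```

For the sample data, rh comes out as 9, 0, 0, 18, 2 and rh2 as 9, 0, 24, 18, 2.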
Upvotes: 0