Add a new column of timedelta values in a particular format in Pandas

Question

I have a dataframe and I want to add a column containing time difference between two another column:

 df[Diff] = df['End Time'] - df['Open Time']
 df[Diff]
 0     0 days 01:25:40
 1     0 days 00:41:57
 2     0 days 00:21:47
 3     0 days 16:41:57
 4     0 days 04:32:00
 5     0 days 03:01:57
 6     0 days 01:37:56
 7     0 days 01:13:57
 8     0 days 01:07:56
 9     0 days 02:33:59
 10   29 days 18:33:53
 11    0 days 03:50:56
 12    0 days 01:57:56

I would like to have this column in format '1h 25m', so I have tried to calculate days in hours:

diff = df['End Time'] - df['Open Time']
hours = diff.dt.days * 24 + diff.dt.components.hours
minutes = diff.dt.components.minutes

and received these results:

0       1
1       0
2       0
3      16
4       4
5       3
6       1
7       1
8       1
9       2
10    714
11      3
12      1
dtype: int64h 0     25
1     41
2     21
3     41
4     32
5      1
6     37
7     13
8      7
9     33
10    33
11    50
12    57
Name: minutes, dtype: int64m

But I can't express these results in this format in new column:

 '{}h {}m'.format(hours,minutes))

cs95 · Accepted Answer

You could extract the relevant columns, convert to str using astype, and just concat the cols as needed.

c = (df['End Time'] - df['Open Time'])\
              .dt.components[['days', 'hours', 'minutes']]
df['diff'] = (c.days * 24 + c.hours).astype(str) + 'h ' + c.minutes.astype(str) + 'm'
df['diff']
0       1h 25m
1       0h 41m
2       0h 21m
3      16h 41m
4       4h 32m
5        3h 1m
6       1h 37m
7       1h 13m
8        1h 7m
9       2h 33m
10    714h 33m
11      3h 50m
12      1h 57m
Name: diff, dtype: object

Add a new column of timedelta values in a particular format in Pandas

Answers (2)

Related Questions