shantanuo

Reputation: 32316

Calculate the total of a string column

How do I calculate the total of a string column in pandas?

myl=[('2012-11-07 19:16:07', ' 2012-11-07 19:21:07', ' 0h 05m 00s'),
 ('2012-11-13 06:16:07', ' 2012-11-13 06:21:07', ' 0h 05m 00s'),
 ('2012-11-15 09:56:07', ' 2012-11-15 11:41:07', ' 1h 45m 00s'),
 ('2012-11-15 22:26:07', ' 2012-11-16 07:01:07', ' 8h 35m 00s')]

import pandas as pd
df = pd.DataFrame(myl, columns=['from', 'to', 'downtime'])

The code above stores each downtime as a string in a single "downtime" column. How do I total the durations in that column?

In [5]: df
Out[5]:
                  from                    to     downtime
0  2012-11-07 19:16:07   2012-11-07 19:21:07   0h 05m 00s
1  2012-11-13 06:16:07   2012-11-13 06:21:07   0h 05m 00s
2  2012-11-15 09:56:07   2012-11-15 11:41:07   1h 45m 00s
3  2012-11-15 22:26:07   2012-11-16 07:01:07   8h 35m 00s

For example, for the output above, the expected total of the downtime column would be 9h 90m 00s (that is, 10h 30m 00s).


Update:

And how do I calculate the downtime per day?

Expected result:

2012-11-07 0h 05m 00s
2012-11-13 0h 05m 00s
2012-11-15 10h 20m 00s

This works as expected:

df['downtime_t'] = pd.to_timedelta(df['downtime'])

df['year'] = pd.DatetimeIndex(pd.to_datetime(df['from'])).year
df['month'] = pd.DatetimeIndex(pd.to_datetime(df['from'])).month
df['day'] = pd.DatetimeIndex(pd.to_datetime(df['from'])).day

df.groupby(['year', 'month', 'day'])['downtime_t'].sum()

And this also works for grouping by year:

df['from_d2'] = pd.to_datetime(df['from'])
df.groupby(df['from_d2'].map(lambda x: x.year))['downtime_t'].sum()

But this does not work:

df.groupby(df['from_d2'].map(lambda x:  x.year, x.month, x.day))['downtime_t'].sum()

Is there another way to get the grouped totals?
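
A note on the failing line: Python parses `lambda x: x.year, x.month, x.day` inside the call as three separate arguments to `map`, so only `x.year` belongs to the lambda. One possible way around it, sketched here under the assumption that grouping by the calendar date of "from" is what is wanted, is to group on `.dt.date` instead of mapping a lambda. The `.str.strip()` call is a precaution against the leading spaces in the sample data, not something the original code used.

```python
import pandas as pd

myl = [('2012-11-07 19:16:07', ' 2012-11-07 19:21:07', ' 0h 05m 00s'),
       ('2012-11-13 06:16:07', ' 2012-11-13 06:21:07', ' 0h 05m 00s'),
       ('2012-11-15 09:56:07', ' 2012-11-15 11:41:07', ' 1h 45m 00s'),
       ('2012-11-15 22:26:07', ' 2012-11-16 07:01:07', ' 8h 35m 00s')]
df = pd.DataFrame(myl, columns=['from', 'to', 'downtime'])

# Convert the duration strings to timedeltas (whitespace stripped as a precaution)
df['downtime_t'] = pd.to_timedelta(df['downtime'].str.strip())
df['from_d2'] = pd.to_datetime(df['from'])

# Group by the calendar date of the "from" timestamp and sum the downtime
by_date = df.groupby(df['from_d2'].dt.date)['downtime_t'].sum()
print(by_date)
```

This gives one row per date, so the two outages that start on 2012-11-15 are added together.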

Upvotes: 0

Views: 172

Answers (1)

Alex

Reputation: 19104

You can use pandas' to_timedelta function to convert the strings to timedeltas, then sum them.

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.to_timedelta.html

pd.to_timedelta(df['downtime']).sum()
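
As a fuller sketch of the same idea, the snippet below rebuilds the DataFrame from the question, sums the column, and formats the result in the "h m s" style the question uses. The formatting step and the names `total` and `formatted` are illustrative additions, not part of the original answer, and `.str.strip()` is only a precaution against the leading spaces in the sample data.

```python
import pandas as pd

myl = [('2012-11-07 19:16:07', ' 2012-11-07 19:21:07', ' 0h 05m 00s'),
       ('2012-11-13 06:16:07', ' 2012-11-13 06:21:07', ' 0h 05m 00s'),
       ('2012-11-15 09:56:07', ' 2012-11-15 11:41:07', ' 1h 45m 00s'),
       ('2012-11-15 22:26:07', ' 2012-11-16 07:01:07', ' 8h 35m 00s')]
df = pd.DataFrame(myl, columns=['from', 'to', 'downtime'])

# Parse the duration strings and add them up
total = pd.to_timedelta(df['downtime'].str.strip()).sum()

# Break the total into hours, minutes and seconds for display
secs = int(total.total_seconds())
hours, rem = divmod(secs, 3600)
minutes, seconds = divmod(rem, 60)
formatted = f"{hours}h {minutes:02d}m {seconds:02d}s"
print(formatted)  # 10h 30m 00s
```

Note that the sum comes out normalized, so the "9h 90m 00s" from the question appears as 10h 30m 00s.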

Upvotes: 2
