Reputation: 4317
I have been looking at Stack Overflow questions to solve this problem, but I haven't found a solution.
I have a dataframe df that looks like this:
                value
pod                22         72         79         86         87         88
time_stamp
2016-10-03  10.160000   0.000000   0.000000  32.004001   5.334000  11.176000
2016-10-10   0.000000   0.000000   0.000000   2.032000   0.000000   0.000000
2016-10-17  16.002001   0.000000   8.636000  21.336001   1.778000   6.604000
2016-10-24   2.032000   6.604000  71.628004  19.050001   0.508000   2.540000
2016-10-31   3.556000  21.590000   0.000000   0.000000   2.032000   2.794000
2016-11-07   3.302000  10.160000   0.762000   0.254000   1.270000   2.540000
2016-11-14  27.686001  44.704001  22.606001   1.524000  26.670001  42.164001
2016-11-21  68.072001  56.896002  14.732000   8.128000  23.114001  63.500002
The output of df.head(5).to_dict() looks like this:
{('value', 22): {Timestamp('2016-10-03 00:00:00'): 10.159999966599999,
Timestamp('2016-10-10 00:00:00'): 0.0,
Timestamp('2016-10-17 00:00:00'): 16.0020005107,
Timestamp('2016-10-24 00:00:00'): 2.0320000648500001,
Timestamp('2016-10-31 00:00:00'): 3.5560001134900006},
('value', 72): {Timestamp('2016-10-03 00:00:00'): 0.0,
Timestamp('2016-10-10 00:00:00'): 0.0,
Timestamp('2016-10-17 00:00:00'): 0.0,
Timestamp('2016-10-24 00:00:00'): 6.6040000915499997,
Timestamp('2016-10-31 00:00:00'): 21.589999973800001},
('value', 79): {Timestamp('2016-10-03 00:00:00'): 0.0,
Timestamp('2016-10-10 00:00:00'): 0.0,
Timestamp('2016-10-17 00:00:00'): 8.6360000968000001,
Timestamp('2016-10-24 00:00:00'): 71.628004074100005,
Timestamp('2016-10-31 00:00:00'): 0.0},
I want to group by column and sum all the values in each column. I'm getting stuck because the columns seem to be a multi-level index.
When I run s = df.sum(axis=1, level=[1]); s, it just removes the first header row (value). How do I get the columns summed instead?
The output could just be a simple dataframe that sums each value column, so it could look something like:
pod 22 72 79...
2016 100 120 110...
2017 80 90 72...
Upvotes: 0
Views: 1068
Reputation: 402573
It seems you need sum along the 0th axis:
df
                value
                   22      72         79
2016-10-03  10.160000   0.000   0.000000
2016-10-10   0.000000   0.000   0.000000
2016-10-17  16.002001   0.000   8.636000
2016-10-24   2.032000   6.604  71.628004
2016-10-31   3.556000  21.590   0.000000

df.sum(axis=0).to_frame().T
       value
          22      72         79
0  31.750001  28.194  80.264004
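For anyone who wants to reproduce this, here is a minimal sketch. It rebuilds the frame from the first five rows and three pods in the question (values rounded as printed, so the construction itself is my assumption, not the asker's actual code). It also shows why grouping the columns by the pod level changes nothing but the header: every pod label is unique, so each group holds a single column. As far as I know, the level argument to sum was removed in pandas 2.0, so the sketch uses a transpose plus groupby in its place.

```python
import pandas as pd

# Assumed reconstruction of the question's frame (first five rows, three pods).
index = pd.DatetimeIndex(
    ["2016-10-03", "2016-10-10", "2016-10-17", "2016-10-24", "2016-10-31"],
    name="time_stamp",
)
columns = pd.MultiIndex.from_product([["value"], [22, 72, 79]], names=[None, "pod"])
df = pd.DataFrame(
    [
        [10.160000, 0.000, 0.000000],
        [0.000000, 0.000, 0.000000],
        [16.002001, 0.000, 8.636000],
        [2.032000, 6.604, 71.628004],
        [3.556000, 21.590, 0.000000],
    ],
    index=index,
    columns=columns,
)

# axis=0 collapses the rows, leaving one total per ('value', pod) column.
totals = df.sum(axis=0)
totals_frame = totals.to_frame().T  # one-row DataFrame, as in the answer

# Grouping the columns by 'pod' puts each column in its own group, so nothing
# is added together; only the 'value' header level disappears.
per_pod = df.T.groupby(level="pod").sum().T
```

Here totals holds the per-pod sums, while per_pod has the same numbers as df with flat pod columns, which matches what the question describes.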
If you want a grouping operation by year, call groupby on index.year:
df.groupby(df.index.year).sum()
          value
             22      72         79
2016  31.750001  28.194  80.264004
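A self-contained sketch of the yearly version, again on an assumed reconstruction of the question's first five rows. The droplevel step is my addition, not part of the answer above: it removes the constant value level so the result has the flat pod columns shown in the question's desired output.

```python
import pandas as pd

# Assumed reconstruction: weekly timestamps starting 2016-10-03, three pods.
index = pd.date_range("2016-10-03", periods=5, freq="7D", name="time_stamp")
columns = pd.MultiIndex.from_product([["value"], [22, 72, 79]], names=[None, "pod"])
df = pd.DataFrame(
    [
        [10.160000, 0.000, 0.000000],
        [0.000000, 0.000, 0.000000],
        [16.002001, 0.000, 8.636000],
        [2.032000, 6.604, 71.628004],
        [3.556000, 21.590, 0.000000],
    ],
    index=index,
    columns=columns,
)

# One row per calendar year, one total per ('value', pod) column.
yearly = df.groupby(df.index.year).sum()

# Optional: drop the outer 'value' level to get plain pod columns.
flat = yearly.droplevel(0, axis=1)
```

With data that spans several years, yearly gets one row per year; the sample rows here all fall in 2016, so there is only one.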
Upvotes: 2