Reputation: 231
I have a dataframe that looks like this :
Place | Name | 2019-03-01 | 2019-03-02 | ... | 2021-03-02 |
---|---|---|---|---|---|
Paris | Sam | 2 | 0 | ... | 0 |
Paris | Steve | 12 | 4 | ... | 0 |
Dublin | Sam | 8 | 1 | ... | 0 |
I have something like hundreds of the date-named columns.
The Name
column contains only Sam or Steve.
I want to create a 'Date' column that contains the dates previously in the date-named columns and create a sum_SAM
and sum_STEVE
columns filled with the sum of the values for each one of Sam and Steve.
So I would get something like this :
Place | Date | sum_SAM | sum_STEVE |
---|---|---|---|
Paris | 2019-03-01 | 2 | 12 |
Dublin | 2019-03-01 | 8 | 0 |
Paris | 2021-03-02 | 0 | 4 |
Dublin | 2021-03-02 | 1 | 0 |
I guess I have to transpose the date-named columns into a new column named 'Date', and then group by place and Date, and sum it. But I really don't know how to do it. The fact that there's a lot of the date-named columns scares me lol
Upvotes: 0
Views: 31
Reputation: 323316
Check with
out = df.set_index(['Place','Name']).stack().sum(level=[0,1,2],axis=0).unstack(level=1)
Upvotes: 3