Reputation: 1430
Lets start with very simplified abstract example, I hava a dataframe like this:
import pandas as pd
d = {'1-A': [1, 2], '1-B': [3, 4], '2-A': [3, 4], '5-B': [2, 7]}
df = pd.DataFrame(data=d)
1-A 1-B 2-A 5-B
0 1 3 3 2
1 2 4 4 7
I'm looking for elegant pandastic solution to have dataframe like this:
1 2 5
0 4 3 2
1 6 4 7
To make example more concrete column 1-A, means person id=1, expenses category A. Rows are expenses every month. In result, I want to have monthly expenses per person across categories (so column 1 is sum of column 1-A and 1-B). Note that, when there is no expenses, there is no column with 0s. Of course it should be ready for more columns (ids and categories).
I'm quite sure that smart solution with good separation of column selection and summing opeation for this exist.
Upvotes: 1
Views: 220
Reputation: 862731
Use groupby
with lambda function with split and select first value, for grouping by columns add axis=1
:
df1 = df.groupby(lambda x: x.split('-')[0], axis=1).sum()
#alternative
#df1 = df.groupby(df.columns.str.split('-').str[0], axis=1).sum()
print (df1)
1 2 5
0 4 3 2
1 6 4 7
Upvotes: 1