Quant Christo
Quant Christo

Reputation: 1430

Summing columns according to pattern in column names

Lets start with very simplified abstract example, I hava a dataframe like this:

import pandas as pd

d = {'1-A': [1, 2], '1-B': [3, 4], '2-A': [3, 4], '5-B': [2, 7]}
df = pd.DataFrame(data=d)

     1-A  1-B  2-A  5-B
0    1    3    3    2
1    2    4    4    7

I'm looking for elegant pandastic solution to have dataframe like this:

     1    2    5
0    4    3    2
1    6    4    7

To make example more concrete column 1-A, means person id=1, expenses category A. Rows are expenses every month. In result, I want to have monthly expenses per person across categories (so column 1 is sum of column 1-A and 1-B). Note that, when there is no expenses, there is no column with 0s. Of course it should be ready for more columns (ids and categories).

I'm quite sure that smart solution with good separation of column selection and summing opeation for this exist.

Upvotes: 1

Views: 220

Answers (1)

jezrael
jezrael

Reputation: 862731

Use groupby with lambda function with split and select first value, for grouping by columns add axis=1:

df1 = df.groupby(lambda x: x.split('-')[0], axis=1).sum()
#alternative
#df1 = df.groupby(df.columns.str.split('-').str[0], axis=1).sum()
print (df1) 

   1  2  5
0  4  3  2
1  6  4  7

Upvotes: 1

Related Questions