L-square
L-square

Reputation: 125

Summing certain columns by similar part of its name

How to sum columns by already fetched list of unique columns partly names ?

list = ['13-14', '15-16']

DataFrame:

                 X.13-14        Y.13-14       Z.13-14      X.15-16   ...
id                                                                        
182761           10274.00    6097173.00     5758902.00    3345841.00  

I.e. I want to create '13-14' and '15-16' columns with corresponding sum of (X.13-14,Y.13-14,Z.13-14), then (X.15-16,Y.15-16,Z.15-16)

Upvotes: 1

Views: 65

Answers (1)

jezrael
jezrael

Reputation: 862731

If want sum columns by columns names after . use lambda function in DataFrame.groupby with axis=1:

df1 = df.groupby(lambda x: x.split('.')[1], axis=1).sum()
print (df1)
             13-14      15-16
id                           
182761  11866349.0  3345841.0

Or if need only columns by list:

L = ['13-14', '15-16']

df.columns = df.columns.str.extract(f'({"|".join(L)})', expand=False)

df1 = df.sum(level=0, axis=1)[L]
print (df1)
             13-14      15-16
id                           
182761  11866349.0  3345841.0

If need add to original:

df = df.join(df1)
print (df)
        X.13-14    Y.13-14    Z.13-14    X.15-16       13-14      15-16
id                                                                     
182761  10274.0  6097173.0  5758902.0  3345841.0  11866349.0  3345841.0

Upvotes: 1

Related Questions