rhaskett

Reputation: 1962

Pandas GroupBy on column names

I have a DataFrame, which for this example can be stood in for by

df = pd.DataFrame({'a':[1,0,0], 'b':[0,1,0], 'c':[1,0,0], 'd':[2,3,4]})

and a category series

category = pd.Series(['A', 'B', 'B', 'A'], ['a', 'b', 'c', 'd'])

I'd like to get a sum of df's columns grouped into the categories 'A', 'B'. Maybe something like:

result = df.groupby(??, axis=1).sum()

returning

result = pd.DataFrame({'A':[3,3,4], 'B':[1,1,0]})

Upvotes: 3

Views: 7803

Answers (3)

Ganesh Kharad

Reputation: 341

Here is what I did to group a DataFrame whose columns share the same names:

data_df:

1    1    2    1
q    r    f    t

Code:

df_grouped = data_df.groupby(data_df.columns, axis=1).agg(lambda x: ' '.join(x.values))

df_grouped:

1       2
q r t   f
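For anyone who wants to reproduce this, here is a minimal sketch. The construction of data_df is an assumption (one row of strings under integer column labels 1, 1, 2, 1), since only its printed form is shown above. It transposes before grouping rather than passing axis=1, because newer pandas releases deprecate axis=1 in groupby.

```python
import pandas as pd

# Assumed reconstruction of data_df: one row of strings, duplicated column labels
data_df = pd.DataFrame([['q', 'r', 'f', 't']], columns=[1, 1, 2, 1])

# Transpose so the duplicated labels become the index, group on that index,
# join the strings within each group, then transpose back
df_grouped = data_df.T.groupby(level=0).agg(lambda s: ' '.join(s)).T

print(df_grouped)
```

The columns labelled 1 collapse into a single column holding 'q r t', and column 2 keeps 'f'.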

Upvotes: 0

BENY

Reputation: 323226

Reindex df's columns to match the order of category's index, then assign the category values as the new column labels and group on them:

df = df.reindex(columns=category.index)
df.columns = category
df.groupby(df.columns.values, axis=1).sum()
Out[1255]: 
   A  B
0  3  1
1  3  1
2  4  0

Or use pd.Series.get to look up the category for each column:

df.groupby(category.get(df.columns),axis=1).sum()
Out[1262]: 
   A  B
0  3  1
1  3  1
2  4  0

Upvotes: 3

cs95

Reputation: 402323

Use groupby + sum on the columns (the axis=1 is important here):

df.groupby(df.columns.map(category.get), axis=1).sum()

   A  B
0  3  1
1  3  1
2  4  0
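If your pandas version warns about or rejects axis=1 in groupby (newer releases deprecate it), a transpose-based variant should give the same result. This is a sketch using the df and category from the question, not a replacement for the one-liner above on older versions.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 0], 'b': [0, 1, 0], 'c': [1, 0, 0], 'd': [2, 3, 4]})
category = pd.Series(['A', 'B', 'B', 'A'], index=['a', 'b', 'c', 'd'])

# Transpose so the original columns become rows; category aligns on that index.
# Sum within each category, then transpose back to the original orientation.
result = df.T.groupby(category).sum().T

print(result)
```

Here column A is a + d and column B is b + c, matching the output requested in the question.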

Upvotes: 5
