As3adTintin

Reputation: 2476

Pandas .groupby(): aggregation to include grouping variable

My Data:

a_1, a_2, b_1, b_2, ...
0,   0,   1,   0,  ...
1,   0,   0,   1,  ...
1,   1,   1,   0,  ...
0,   1,   0,   0,  ...
etc...

I want to sum every column over only the rows where a_1 == 1, then do the same for the rows where b_1 == 1, then c_1 == 1, and so on.

Right now I have testDict = {k : df[df[k + '_1']==1].groupby(k + '_1').sum() for k in letters}

However, this sums every column except the one I am grouping by, and I want the sum for that column too. Any thoughts or suggestions?

The output should look like:

testDict['a'] : 
a_1, a_2, b_1, b_2,  ...
2,   1,   1,   1, ...

testDict['b'] :
a_1, a_2, b_1, b_2,  ...
1,   1,   2,   0,  ....

Thank you.

Upvotes: 1

Views: 78

Answers (1)

As3adTintin

Reputation: 2476

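Oh, whoops, I totally missed this. The rows are already filtered down to a single value of the grouping column, so the groupby is unnecessary. I can just use testDict = {k : df[df[k + '_1']==1].sum() for k in letters} with no groupby!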
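
For anyone who wants to try this, here is a minimal sketch of the filter-then-sum approach. It assumes the frame contains only the four columns shown in the question and that letters is ['a', 'b']; both are stand-ins for the real data, which is elided above.

```python
import pandas as pd

# Sample data copied from the question; only the four columns shown there.
df = pd.DataFrame({
    "a_1": [0, 1, 1, 0],
    "a_2": [0, 0, 1, 1],
    "b_1": [1, 0, 1, 0],
    "b_2": [0, 1, 0, 0],
})

# Assumption: the prefixes to loop over (not defined in the original post).
letters = ["a", "b"]

# Keep only the rows where <letter>_1 == 1, then sum every column,
# including the column used for the filter.
test_dict = {k: df[df[k + "_1"] == 1].sum() for k in letters}

# Each value is a Series indexed by column name.
print(test_dict["a"])
print(test_dict["b"])
```

With this sample the sums come out as 2, 1, 1, 1 for 'a' and 1, 1, 2, 0 for 'b', matching the expected output in the question.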

I ended up with testDict = {k : pd.DataFrame(df[df[k + '_1']==1].sum()).transpose() for k in letters} to keep the horizontal layout (.sum() on a DataFrame returns a Series, which displays vertically).
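
A sketch of the one-row layout, using Series.to_frame().T as a slightly shorter equivalent of wrapping the Series in pd.DataFrame(...) and transposing. The sample frame and letters = ['a', 'b'] are again assumptions based on the columns shown in the question.

```python
import pandas as pd

# Sample data copied from the question; only the four columns shown there.
df = pd.DataFrame({
    "a_1": [0, 1, 1, 0],
    "a_2": [0, 0, 1, 1],
    "b_1": [1, 0, 1, 0],
    "b_2": [0, 1, 0, 0],
})

# Assumption: the prefixes to loop over (not defined in the original post).
letters = ["a", "b"]

# .sum() returns a Series; .to_frame() turns it into a one-column DataFrame
# and .T transposes that into a single row with the original column names.
wide_dict = {
    k: df[df[k + "_1"] == 1].sum().to_frame().T
    for k in letters
}

print(wide_dict["a"])
print(wide_dict["b"])
```

Each value is now a DataFrame with one row and the same column order as df, so the results can be stacked later with pd.concat if a single table is wanted.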

Upvotes: 1
