Constantine1991
Constantine1991

Reputation: 111

Grouping and getting count for multiple columns

I have a dataframe like

column_one  columnn_two     type    column_three    
apple       headphones      one         yes
apple       headphones      two         yes
apple       tv              one         no
apple       iPhones         two         yes
apple       iPad            one         no
apple       iPad            two         no

I wanna group on multiple rows and get their count like

column_one  columnn_two     yes         no  
apple       headphones      2           0
apple       tv              0           1
apple       iPhones         1           0
apple       iPad            0           2

I am aware about how to do groupby but not sure how do I count for multiple rows and convert rows into columns to get the count.

Upvotes: 0

Views: 49

Answers (1)

ilja
ilja

Reputation: 2692

Probably not the most efficient way, but maybe it still helps :-)

I used a custom aggregation function sum_col_three(x) via apply() and converted the results to a new column via to_frame(). Afterwards I splitted the tuple into two separate columns with a new DataFrame and tolist():

def sum_col_three(x):
    return sum(x['column_three']=='yes'), sum(x['column_three']=='no')

df = df.groupby(['column_one', 'column_two']).apply(sum_col_three).to_frame('yes')
df[['yes', 'no']] = pd.DataFrame(df['yes'].tolist(), index=df.index) 

df

>>                           yes    no
>>column_one    column_two      
>>apple         headphones   2      0
>>              iPad         0      2
>>              iPhones      1      0
>>              tv           0      1

Upvotes: 2

Related Questions