aszet
aszet

Reputation: 93

Iterate over unique values in first column while counting other columns within the value of the first column

I have some data that looks like this:

agent external id,product_group,commission_rate,weeks_held
3000-29,Collection,0.85,1
3000-29,Collection,0.85,2
3000-29,Return,0.85,1
3000-12,Collection,0.85,1
3000-12,Collection,0.85,2
3000-12,Return,0.85,1
3000-34,Collection,0.8,2
3000-34,Collection,0.8,2
3000-34,Return,0.8,1
3022-29,Collection,0.75,1
3022-29,Collection,0.75,2
3022-29,Return,0.75,1

I'm trying to create a dataframe that loops over each agent external id and gives me each as separate columns:

The output I'm looking for is:

agent_external_id,count_collection,sum_collection_weeks_held,count_return,sum_return_weeks_held
3000-29,2,3,0,0
3000-29,0,0,1,1
3000-12,2,3,0,0
3000-12,0,0,1,1
...

I've tried all types of groupby() but I'm not able to structure the data in the way that I want.

Upvotes: 0

Views: 24

Answers (1)

wwnde
wwnde

Reputation: 26686

Is this what you would like. Please give a sample output and someone will help.

df.groupby(['agent_external_id','product_group']).agg(['sum','count'])



                                       commission_rate       weeks_held      
                                            sum  count       sum  count
agent_external_id product_group                                       
3000-12           Collection               1.70     2          3     2
                  Return                   0.85     1          1     1
3000-29           Collection               1.70     2          3     2
                  Return                   0.85     1          1     1
3000-34           Collection               1.60     2          4     2
                  Return                   0.80     1          1     1
3022-29           Collection               1.50     2          3     2
                  Return                   0.75     1          1     1

Upvotes: 1

Related Questions