user3058703

Reputation: 591

Keeping all columns after Pandas groupby

    Postcode       Borough          Neighbourhood
283      M8Z     Etobicoke              Mimico NW
284      M8Z     Etobicoke     The Queensway West
285      M8Z     Etobicoke  Royal York South West
286      M8Z     Etobicoke         South of Bloor
287      M9Z  Not assigned           Not assigned

I have a Pandas dataframe in this format. I have used the code

Toronto = Toronto.groupby('Postcode')['Neighbourhood'].agg([('Neighbourhood', ', '.join)]).reset_index()

to group by Postcode so that the Neighbourhoods for each unique Postcode are joined into a single comma-separated string. How can I modify this code so that the 'Borough' column remains in the dataframe? There is a one-to-one mapping between Borough and Postcode.

Upvotes: 1

Views: 593

Answers (3)

Valdi_Bo

Reputation: 30971

One possible solution is to add Borough to the group-by list:

Toronto.groupby(['Postcode', 'Borough']).Neighbourhood\
    .agg(', '.join).reset_index()
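
For anyone who wants to try this, here is a minimal, self-contained sketch. It assumes the five sample rows shown in the question (rebuilt by hand, with a default index instead of 283-287) and is only meant to illustrate grouping on both keys:

```python
import pandas as pd

# Sample rows copied from the question; the real dataframe is larger.
Toronto = pd.DataFrame({
    "Postcode": ["M8Z", "M8Z", "M8Z", "M8Z", "M9Z"],
    "Borough": ["Etobicoke"] * 4 + ["Not assigned"],
    "Neighbourhood": [
        "Mimico NW",
        "The Queensway West",
        "Royal York South West",
        "South of Bloor",
        "Not assigned",
    ],
})

# Grouping on both keys keeps Borough as a column after reset_index().
result = (
    Toronto.groupby(["Postcode", "Borough"])["Neighbourhood"]
    .agg(", ".join)
    .reset_index()
)
print(result)
```

Because each Postcode has exactly one Borough, adding Borough to the keys should not create extra groups; if the mapping were not 1:1, a Postcode would be split into one row per Borough.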

Upvotes: 0

user3058703

Reputation: 591

Solved with

Toronto = (Toronto.groupby(['Postcode', 'Borough'])['Neighbourhood']
                   .agg([('Neighbourhood', ', '.join)]).reset_index())

Thanks @ALollz for the nudge

Upvotes: 3

harpan

Reputation: 8631

Since the relationship is 1:1, you can aggregate Borough with 'unique' and keep grouping on Postcode alone.

df.groupby('Postcode').agg({
    'Neighbourhood': ','.join,
    'Borough': 'unique'
})

Output:

                                                              Neighbourhood         Borough
Postcode
M8Z       Mimico NW,The Queensway West,Royal York South West,South of Bloor     [Etobicoke]
M9Z                                                            Not assigned  [Not assigned]
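
Note that 'unique' leaves a one-element array in each Borough cell (e.g. [Etobicoke]). If plain strings are preferred, one possible variation is to aggregate with 'first' instead. This is a sketch under the assumption that the mapping really is 1:1; the sample frame below is rebuilt by hand from the question's rows:

```python
import pandas as pd

# Sample rows copied from the question; `df` stands in for the Toronto dataframe.
df = pd.DataFrame({
    "Postcode": ["M8Z", "M8Z", "M8Z", "M8Z", "M9Z"],
    "Borough": ["Etobicoke"] * 4 + ["Not assigned"],
    "Neighbourhood": [
        "Mimico NW",
        "The Queensway West",
        "Royal York South West",
        "South of Bloor",
        "Not assigned",
    ],
})

# 'first' takes the first Borough in each group, giving a scalar per Postcode.
# This silently hides any Postcode that maps to more than one Borough.
result = (
    df.groupby("Postcode")
    .agg({"Neighbourhood": ", ".join, "Borough": "first"})
    .reset_index()
)
print(result)
```

If there is any doubt about the 1:1 assumption, df.groupby('Postcode')['Borough'].nunique() can be checked first to confirm every group has exactly one Borough.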

Upvotes: 1
