Ge0Pyth0n
Ge0Pyth0n

Reputation: 91

Difference between rows as a list within a column

I have a (messy) pd.df like this,

Number    Day        Colour   City
123       Monday     Red      London
234       Monday     Blue     Paris
234       Wednesday  Yellow   Paris
345       Tuesday    Green    Berlin

Is there an elegant way of making a new df which has the same columns but shows the differences between two rows of like ['Number'] as a list within a columns. So the desired output is something like this.

Number    Day                Colour        City
123       Monday             Red           London
234       Monday, Wednesday  Blue, Yellow  Paris
345       Tuesday            Green         Berlin

Even a point in the right direction regarding which functions I should be looking at would be helpful. The values I am interested in will always be a string rather than a float or integer. Many thanks.

Upvotes: 2

Views: 72

Answers (2)

piRSquared
piRSquared

Reputation: 294318

groupby and agg

NOTE: {*s} returns a set of all the things in s. This makes sure that we are only joining unique things and explains why we get Paris and not Paris, Paris

df.groupby('Number').agg(lambda s: ', '.join({*s}))

                      Day        Colour    City
Number                                         
123                Monday           Red  London
234     Wednesday, Monday  Blue, Yellow   Paris
345               Tuesday         Green  Berlin

OR

df.groupby('Number', as_index=False).agg(lambda s: ', '.join({*s}))

   Number                Day        Colour    City
0     123             Monday           Red  London
1     234  Wednesday, Monday  Blue, Yellow   Paris
2     345            Tuesday         Green  Berlin

Upvotes: 2

YOLO
YOLO

Reputation: 21719

Here's a way to do it:

df = (df
     .groupby('Number')
     .agg(','.join)
     .applymap(lambda x: x.split(',') if ',' in x else x)
     .reset_index())

   Number                  Day          Colour            City
0     123               Monday             Red          London
1     234  [Monday, Wednesday]  [Blue, Yellow]  [Paris, Paris]
2     345              Tuesday           Green          Berlin

Upvotes: 0

Related Questions