Shi Jie Tio

Reputation: 2529

How to join the values of each group's rows into a single string in a new column?

I have a table as shown below:

Id    Family    Modal
a1     Jack      A381
a2     Jack      B674
a4    Sutyama    789b
a5    Sutyama    987y

I want to get the following output:

Id    Family    Modal   Overall
a1     Jack      A381   A381,B674
a2     Jack      B674   A381,B674
a4    Sutyama    789b   789b,987y
a5    Sutyama    987y   789b,987y

I tried the code below, but it returns an empty Overall column:

df["Overall"]=df.groupby("Family")["Modal"].apply(' '.join)

Does anyone have any ideas?

Upvotes: 2

Views: 39

Answers (2)

BENY

Reputation: 323326

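You can also fix your code with map, which looks up each row's Family in the grouped result. This keeps the space separator from your code; use ','.join to get commas.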

df["Overall"]=df.Family.map(df.drop_duplicates(['Family','Modal']).groupby("Family")["Modal"].apply(' '.join))
df
Out[45]: 
   Id   Family Modal    Overall
0  a1     Jack  A381  A381 B674
1  a2     Jack  B674  A381 B674
2  a4  Sutyama  789b  789b 987y
3  a5  Sutyama  987y  789b 987y
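
For context on why the original assignment came back empty: the grouped result is indexed by Family, while df is indexed 0 to 3, so pandas finds no matching labels when it aligns the two. Here is a minimal sketch, assuming the sample frame from the question with its default integer index. The names `joined` and `misaligned` are mine, and I use agg with a comma separator instead of apply with a space:

```python
import pandas as pd

# Sample data from the question (default RangeIndex 0..3 is an assumption).
df = pd.DataFrame({
    "Id": ["a1", "a2", "a4", "a5"],
    "Family": ["Jack", "Jack", "Sutyama", "Sutyama"],
    "Modal": ["A381", "B674", "789b", "987y"],
})

# One joined string per family; the result is indexed by Family, not by row number.
joined = df.groupby("Family")["Modal"].agg(",".join)

# Column assignment aligns on index labels. Reindexing to df.index shows the effect:
# the labels 0..3 never match "Jack" or "Sutyama", so every value is NaN.
misaligned = joined.reindex(df.index)

# map looks up each row's Family in `joined`, giving one value per original row.
df["Overall"] = df["Family"].map(joined)
print(df)
```

The drop_duplicates call in the one-liner above only matters if the same Family/Modal pair can appear more than once; with the sample data it changes nothing.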

Upvotes: 2

cs95

Reputation: 402852

Here are my rules of thumb when applying functions with groupby:

  • To compute and return an aggregated output (one row per group), use GroupBy.agg or GroupBy.apply.
  • To broadcast an aggregated result back to the original rows, use GroupBy.transform.

This is a use case for the second rule:

df['Overall'] = df.groupby("Family")["Modal"].transform(','.join)
df

   Id   Family Modal    Overall
0  a1  Jack     A381  A381,B674
1  a2  Jack     B674  A381,B674
2  a4  Sutyama  789b  789b,987y
3  a5  Sutyama  987y  789b,987y
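
To make the difference between the two rules concrete, here is a small sketch, assuming the sample frame from the question. The names `per_group` and `per_row` are mine, chosen only for illustration:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Id": ["a1", "a2", "a4", "a5"],
    "Family": ["Jack", "Jack", "Sutyama", "Sutyama"],
    "Modal": ["A381", "B674", "789b", "987y"],
})

grouped = df.groupby("Family")["Modal"]

# Rule 1: agg collapses each group to a single value, indexed by the group key.
per_group = grouped.agg(",".join)

# Rule 2: transform repeats that value for every row of the group and
# keeps the original index, so it can be assigned straight back to df.
per_row = grouped.transform(",".join)

print(len(per_group), len(per_row))
df["Overall"] = per_row
```

Because transform preserves the index of df, the assignment lines up row for row without any extra map or merge step.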

Upvotes: 3
