Reputation: 724

group by two columns and get unique column values in a list

I'm trying to get the unique values per name1 and name2 of column prod into a further column all_prods. If I just group by one column (for example name1) then I know how to do it but how does it work if I want to group by two columns (name1 and name2)?

df = pd.DataFrame({'name1':['x','x','x','z','z'],'name2':['x','x','x','z','z'],'prod':['c','c','f','f','f']})

df.assign(**{'all_prods': df.name1.map(df.groupby('name1')['prod'].unique())})

Upvotes: 0

Answers (3)

Khaled Koubaa

Reputation: 527

another way to do it:

df.merge(df.groupby(['name1', 'name2'])['prod'].apply(lambda x: x.unique()).reset_index(name='all_prods'), on=['name1', 'name2'], how='left')

#

    name1   name2   prod    all_prods
0   x       x       c       [c, f]
1   x       x       c       [c, f]
2   x       x       f       [c, f]
3   z       z       f       [f]
4   z       z       f       [f]

Upvotes: 0

mozway

Reputation: 260420

Variant of your original approach:

df.merge(df.groupby(['name1', 'name2'], as_index=False)
           .agg(all_prods=('prod', lambda x: list(dict.fromkeys(x)))))

output:

  name1 name2 prod all_prods
0     x     x    c    [c, f]
1     x     x    c    [c, f]
2     x     x    f    [c, f]
3     z     z    f       [f]
4     z     z    f       [f]

Upvotes: 0

BENY

Reputation: 323226

Doing transform

df['all_prods'] = df.groupby(['name1','name2'])['prod'].transform(lambda x : len(x)* [x.unique()])
df
Out[6]: 
  name1 name2 prod all_prods
0     x     x    c    [c, f]
1     x     x    c    [c, f]
2     x     x    f    [c, f]
3     z     z    f       [f]
4     z     z    f       [f]

Upvotes: 1

group by two columns and get unique column values in a list

Answers (3)

Related Questions