jovicbg
jovicbg

Reputation: 1553

Store unique values from group to another column in Pandas

I have a problem with getting unique values from one column to another as value.

df I have:

id  value1 valueNo
1    12      140
1    13      149
1    11      149
2    11      nan
2    11      150
3    15      145
3    12      149

Desired output would be

id  value1 valueNo   uniqueNo
1    12      140      140, 149
1    13      149      140, 149
1    11      149      140, 149
2    11      nan      150
2    11      150      150
3    15      145      145, 149
3    12      149      145, 149

I have tried with a few approaches but nothing works for me.

df['uniqueNo']=df.groupby(['id'])['valueNo'].apply(lambda x: x.unique())
d['uniqueNo'] = df.groupby(['id'])['valueNo'].apply(list)

Upvotes: 1

Views: 329

Answers (1)

jezrael
jezrael

Reputation: 862581

If missing values are not problem use GroupBy.transform with unique:

df['uniqueNo']=df.groupby(['id'])['valueNo'].transform('unique')
print (df)
   id  value1  valueNo        uniqueNo
0   1      12    140.0  [140.0, 149.0]
1   1      13    149.0  [140.0, 149.0]
2   1      11    149.0  [140.0, 149.0]
3   2      11      NaN    [nan, 150.0]
4   2      11    150.0    [nan, 150.0]
5   3      15    145.0  [145.0, 149.0]
6   3      12    149.0  [145.0, 149.0]

If need remove them solution is first remove them, aggregate unique and map to new column:

s = df.dropna(subset=['valueNo'])['valueNo'].astype(int).groupby(df['id']).unique()
#if converting to intgers is not necessary
#s = df.dropna(subset=['valueNo']).groupby('id')['valueNo'].unique()
df['uniqueNo'] = df['id'].map(s)
print (df)
   id  value1  valueNo    uniqueNo
0   1      12    140.0  [140, 149]
1   1      13    149.0  [140, 149]
2   1      11    149.0  [140, 149]
3   2      11      NaN       [150]
4   2      11    150.0       [150]
5   3      15    145.0  [145, 149]
6   3      12    149.0  [145, 149]

Upvotes: 2

Related Questions