Reputation: 820
I have a dataframe:
import pandas as pd
df1 = pd.DataFrame({'value': [100, 100, 100, 200, 300, 300, 400],
                    'owner': list('aabbaba')})
value owner
100 a
100 a
100 b
200 b
300 a
300 b
400 a
I need to group by the value column and produce a new output in which the value column contains no duplicates. A value that is linked to both a and b should have an owner value of both, while a value with only a single owner should just keep that owner.
Desired output:
value owner
100 both
200 b
300 both
400 a
Thanks for the help
Upvotes: 1
Views: 46
Reputation: 88276
One approach is to check the number of unique owners within each group, and then use np.where to replace the owner with either both or the single unique value contained in the group:
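import numpy as np
# count the distinct owners for each value (df1 is the frame from the question)
out = df1.groupby('value', as_index=False).agg({'owner': 'nunique'})
# 'both' where a value has two owners, otherwise the group's only owner
out['owner'] = np.where(out.owner.eq(2),
                        'both',
                        df1.groupby('value').owner.first())
print(out)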
value owner
0 100 both
1 200 b
2 300 both
3 400 a
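If you would rather do it in a single pass, the same idea can be sketched with a custom aggregation function. This is only an alternative sketch, not a replacement for the approach above: collapse_owner is a hypothetical helper name introduced here, and it assumes that any value with more than one distinct owner should be labelled both.

```python
import pandas as pd

df1 = pd.DataFrame({'value': [100, 100, 100, 200, 300, 300, 400],
                    'owner': list('aabbaba')})

def collapse_owner(owners: pd.Series) -> str:
    # Hypothetical helper: return 'both' when the group has more than
    # one distinct owner, otherwise return that single owner.
    unique = owners.unique()
    return 'both' if len(unique) > 1 else unique[0]

# One row per value, with the owner column collapsed by the helper.
out = df1.groupby('value')['owner'].agg(collapse_owner).reset_index()
print(out)
```

A Python-level function like this is usually slower than nunique plus np.where on large frames, but it keeps the rule in one place if the labelling logic later needs to change.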
Upvotes: 2