Create new column by grouping a single column in pandas dataframe

Question

I have a dataframe

df1 = pd.DataFrame({'value': [100, 100, 100, 200, 300, 300, 400],
                    'owner': list('aabbaba')})
 value    owner
  100       a
  100       a
  100       b
  200       b
  300       a
  300       b
  400       a

I need to group by the value column and produce output in which no value appears more than once. A value that is linked to both a and b should get an owner of both, while a value with only a single owner should keep that owner.

Output desired:

value    owner
 100     both
 200     b
 300     both
 400     a

Thanks for the help

yatu · Accepted Answer

One approach is to check the number of unique owners within each group, and use np.where to replace the count with either both or the single owner contained in that group:

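import numpy as np

# Count the distinct owners for each value
out = df1.groupby('value', as_index=False).agg({'owner': 'nunique'})
# Two distinct owners -> 'both'; otherwise keep the group's only owner
out['owner'] = np.where(out.owner.eq(2),
                        'both',
                        df1.groupby('value').owner.first())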

print(out)

   value owner
0    100  both
1    200     b
2    300  both
3    400     a
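
A possible single-pass variant, offered only as a sketch and not part of the original answer: aggregate each group with a small helper that returns both when the group holds more than one distinct owner, and the lone owner otherwise. The helper name label_owner and the result name out2 are hypothetical, and the "more than one" test assumes, as in the question, that a and b are the only owners.

```python
import pandas as pd

df1 = pd.DataFrame({'value': [100, 100, 100, 200, 300, 300, 400],
                    'owner': list('aabbaba')})

def label_owner(owners):
    # Hypothetical helper: 'both' when the group has more than one
    # distinct owner, otherwise the single owner found in the group.
    unique = owners.unique()
    return 'both' if len(unique) > 1 else unique[0]

# Apply the helper once per value group, then restore 'value' as a column
out2 = df1.groupby('value')['owner'].agg(label_owner).reset_index()
print(out2)
```

This avoids the second groupby call, at the cost of running a Python function per group, so the np.where version is likely to be faster on large frames.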
