Reputation: 18547
I want to label every item whose count is <= 1 as "other". Code for the input table:
import pandas as pd
df=pd.DataFrame({"item":['a','a','a','b','b','c','d']})
input table:
item
0 a
1 a
2 a
3 b
4 b
5 c
6 d
expected output:
item result
0 a a
1 a a
2 a a
3 b b
4 b b
5 c other
6 d other
How could I achieve that?
Upvotes: 2
Views: 456
Reputation: 34086
Use numpy.where with GroupBy.transform and Series.le:
In [926]: import numpy as np
In [927]: df['result'] = np.where(df.groupby('item')['item'].transform('count').le(1), 'other', df.item)
In [928]: df
Out[928]:
item result
0 a a
1 a a
2 a a
3 b b
4 b b
5 c other
6 d other
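To see why this works, it may help to look at the intermediate values. The sketch below is a standalone script that rebuilds the question's frame; the names counts and mask are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"item": ['a', 'a', 'a', 'b', 'b', 'c', 'd']})

# transform('count') returns one value per original row:
# the size of the group that row belongs to.
counts = df.groupby('item')['item'].transform('count')
print(counts.tolist())  # [3, 3, 3, 2, 2, 1, 1]

# True where the item occurs at most once.
mask = counts.le(1)
print(mask.tolist())  # [False, False, False, False, False, True, True]

# Pick 'other' where mask is True, otherwise keep the original item.
df['result'] = np.where(mask, 'other', df['item'])
print(df)
```

Because transform keeps the original index, the mask lines up row for row with df and no merge is needed.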
Or use GroupBy.size with merge:
In [917]: x = df.groupby('item').size().reset_index()
In [919]: ans = df.merge(x)
In [921]: ans['result'] = np.where(ans[0].le(1), 'other', ans.item)
In [923]: ans = ans.drop(columns=0)  # positional axis argument was removed in pandas 2.0
In [924]: ans
Out[924]:
item result
0 a a
1 a a
2 a a
3 b b
4 b b
5 c other
6 d other
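A possible variant of the merge approach gives the count column an explicit name, so the integer label 0 never has to be referenced. This is a sketch, not the only way to do it; the column name n is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"item": ['a', 'a', 'a', 'b', 'b', 'c', 'd']})

# Rows per item, with the count column named 'n' (illustrative name).
counts = df.groupby('item').size().reset_index(name='n')

# A left merge on 'item' attaches each row's count and keeps df's row order.
ans = df.merge(counts, on='item', how='left')
ans['result'] = np.where(ans['n'].le(1), 'other', ans['item'])

# Drop the helper column once the result is computed.
ans = ans.drop(columns='n')
print(ans)
```

Passing on='item' explicitly also keeps the merge from picking up other shared columns if df has already gained extra columns.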
Upvotes: 1
Reputation: 863741
Use Series.where, flagging every value that occurs more than once with Series.duplicated and keep=False:
df['result'] = df.item.where(df.item.duplicated(keep=False), 'other')
Or use GroupBy.transform and test for group sizes greater than 1 with Series.gt:
df['result'] = df.item.where(df.groupby('item')['item'].transform('size').gt(1), 'other')
Or use Series.map with Series.value_counts:
df['result'] = df.item.where(df['item'].map(df['item'].value_counts()).gt(1), 'other')
print (df)
item result
0 a a
1 a a
2 a a
3 b b
4 b b
5 c other
6 d other
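If the cutoff might change later, the value_counts version can be wrapped in a small helper. The function name lump_rare and its parameters min_count and other are hypothetical, introduced here only for illustration.

```python
import pandas as pd

def lump_rare(s, min_count=2, other='other'):
    """Replace values occurring fewer than min_count times with `other`."""
    # Map each value to the number of times it occurs in the Series.
    counts = s.map(s.value_counts())
    # Keep values that are frequent enough, replace the rest.
    return s.where(counts.ge(min_count), other)

df = pd.DataFrame({"item": ['a', 'a', 'a', 'b', 'b', 'c', 'd']})
df['result'] = lump_rare(df['item'])
print(df)

# With a stricter cutoff, 'b' (2 occurrences) is lumped in as well.
strict = lump_rare(df['item'], min_count=3)
print(strict.tolist())  # ['a', 'a', 'a', 'other', 'other', 'other', 'other']
```

With the default min_count=2 this gives the same result column as the one-liners above.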
Upvotes: 2