Satya
Satya

Reputation: 5907

finding common value in one column against distinct value in another column in pandas

i have a dataframe like

event    cust
 et1   satya
 et1    papu
 et1     abc
 et1   satya
 et1     def
 et2    papu
 et2   satya
 et2   panda
 et3  normal
 et3   panda
 et3   satya
 et3     fgh

Now i need to findout 'cust' who exist for all 3 type of events.so this should yield

event  cust
et1  satya
et1  satya

Don't worry about distinct(duplicates can be dropped). For this my approach is

x  = df[df['event'] == 'et1']
y  = df[df['event'] == 'et2']
z  = df[df['event'] == 'et3']
df_common = x[x['cust'].isin(y[y['cust'].isin(z.cust)]['cust'])]

But this will not appropriate in the case, when the DataFrame size will be huge and i have to find common cust for some 50-100+ events.

Please suggest some pandas/more-pythonic way to do so.Thanks in advance.

Upvotes: 1

Views: 39

Answers (1)

jezrael
jezrael

Reputation: 862521

You can try:

#first drop duplicates in each group by event
df = df.drop_duplicates(['event','cust'])

#count  values
counts = df.cust.value_counts()
print counts
satya     3
panda     2
papu      2
def       1
normal    1
fgh       1
abc       1
Name: cust, dtype: int64

#get number of unique events
uniqevents = df.event.nunique()
print uniqevents
3
#get values with count == uniqevents
counts = counts[counts == uniqevents]
print counts
satya    3
Name: cust, dtype: int64

print counts.index.to_series().reset_index(drop=True)
0    satya
dtype: object

Upvotes: 1

Related Questions