Reputation: 5907
i have a dataframe like
event cust
et1 satya
et1 papu
et1 abc
et1 satya
et1 def
et2 papu
et2 satya
et2 panda
et3 normal
et3 panda
et3 satya
et3 fgh
Now i need to findout 'cust' who exist for all 3 type of events.so this should yield
event cust
et1 satya
et1 satya
Don't worry about distinct(duplicates can be dropped). For this my approach is
x = df[df['event'] == 'et1']
y = df[df['event'] == 'et2']
z = df[df['event'] == 'et3']
df_common = x[x['cust'].isin(y[y['cust'].isin(z.cust)]['cust'])]
But this will not appropriate in the case, when the DataFrame size will be huge and i have to find common cust for some 50-100+ events.
Please suggest some pandas/more-pythonic way to do so.Thanks in advance.
Upvotes: 1
Views: 39
Reputation: 862521
You can try:
#first drop duplicates in each group by event
df = df.drop_duplicates(['event','cust'])
#count values
counts = df.cust.value_counts()
print counts
satya 3
panda 2
papu 2
def 1
normal 1
fgh 1
abc 1
Name: cust, dtype: int64
#get number of unique events
uniqevents = df.event.nunique()
print uniqevents
3
#get values with count == uniqevents
counts = counts[counts == uniqevents]
print counts
satya 3
Name: cust, dtype: int64
print counts.index.to_series().reset_index(drop=True)
0 satya
dtype: object
Upvotes: 1