Reputation: 2904
I'm having two dataframes like below-
dataframe 1-
df1_data = {'sym' :{0:'AAA',1:'BBB',2:'CCC',3:'DDD',4:'EEE',5:'FFF'}}
df = pd.DataFrame(df1_data)
dataframe 2-
df2_data = {'sym1' :{0:'AAA',1:'BB',2:'XXX',3:'A',4:'CCC',5:'D',6:'EEE',7:'EEE',8:'FFF'}}
df2 = pd.DataFrame(df2_data)
I want to check sym column available in dataframe 1 is how many times in sym1 column of dataframe 2?
Expected Result -
sym,count
0 AAA,1
1 BBB,0
2 CCC,1
3 DDD,0
4 EEE,2
5 FFF,1
Upvotes: 2
Views: 494
Reputation: 294278
numpy
broadcastingdf.assign(count=(df.sym.values[:, None] == df2.sym1.values).sum(1))
sym count
0 AAA 1
1 BBB 0
2 CCC 1
3 DDD 0
4 EEE 2
5 FFF 1
Upvotes: 4
Reputation: 76917
Here's one way by join
ing on df
with sym
index on df2
's value_counts()
, then fill NaN
, and for view's sake, use reset_index()
In [113]: df.set_index('sym').join(df2.sym1.value_counts()).fillna(0).reset_index()
Out[113]:
sym sym1
0 AAA 1.0
1 BBB 0.0
2 CCC 1.0
3 DDD 0.0
4 EEE 2.0
5 FFF 1.0
Details
In [114]: df2.sym1.value_counts()
Out[114]:
EEE 2
CCC 1
XXX 1
FFF 1
BB 1
D 1
AAA 1
A 1
Name: sym1, dtype: int64
Or, using map
In [141]: df['count'] = df['sym'].map(df2.sym1.value_counts()).fillna(0)
In [142]: df
Out[142]:
sym count
0 AAA 1.0
1 BBB 0.0
2 CCC 1.0
3 DDD 0.0
4 EEE 2.0
5 FFF 1.0
Upvotes: 4
Reputation: 862671
Solution with value_counts
and reindex
, values in df['sym']
has to be unique.
print (df2.sym1.value_counts().reindex(df['sym'], fill_value=0).reset_index())
sym sym1
0 AAA 1
1 BBB 0
2 CCC 1
3 DDD 0
4 EEE 2
5 FFF 1
Upvotes: 2
Reputation: 1838
Here is a solution where you need to iterate df and df2 only once:
from collections import defaultdict
vals_dict = defaultdict(list, df2.groupby('sym1').indices)
df['count'] = df['sym'].apply(lambda x: len(vals_dict[x]))
print df
It will add to df a new column with the requested values:
sym count
0 AAA 1
1 BBB 0
2 CCC 1
3 DDD 0
4 EEE 2
5 FFF 1
Upvotes: 0
Reputation: 2428
one simple way would be to:
df['count'] = df['sym'].map(lambda x:list(df2['sym1'].values).count(x))
hope it helps
Upvotes: 0