Reputation: 1435
I want to match two string columns and count number of (exact) matches. For example, there are two columns like
index col1 col2
0 aa ji
1 bs aa
2 qe bs
3 gd aa
col1 consists of unique ids. I want to count how many times each element of col1 occurs in col2. In other words, I would like to get an output like:
col3
2
1
0
0
in above example.
I have tried above work using pandas str.contains() and for loop, but given a large number of observations, it seems too slow and inefficient. My code looks like below.
num = []
for i in range(len(col1)):
count = col2.str.contains(col1[i]).sum()
num_replies.append(count)
Is there a time-efficient way to do this work?
Upvotes: 1
Views: 79
Reputation: 153560
Use map
and value_count
:
df['col3'] = df['col1'].map(df['col2'].value_counts()).fillna(0)
Output:
index col1 col2 col3
0 0 aa ji 2.0
1 1 bs aa 1.0
2 2 qe bs 0.0
3 3 gd aa 0.0
Upvotes: 2
Reputation: 641
try this :-
df['counts'] = df.col1.apply(lambda x: list(df.col2.values).count(x))
Upvotes: 2