Reputation: 821
I have a dataframe called result:
find_a id find_b id
yes 0001 yes 0001
no 0002 yes 0002
no 0003 no 0003
yes 0004 no 0004
yes 0005 yes 0005
I have the following:
result.find_a.values == result.find_b.values
Which returns an array of True/False:
array([ True, False, True, False, True])
How do I build on this and get a count of True? If I can get the count, I can then later get a percentage of matched records between the columns, e.g. find_a matched with find_b 40% of the time.
Also, I'm not sure if I am venturing down the numpy or pandas route...
Thanks for the help in advance.
Upvotes: 2
Views: 872
Reputation: 120
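import numpy as np
len(result[result.find_a == result.find_b])  # number of rows where the columns match
np.mean(result.find_a == result.find_b)      # fraction of rows where the columns match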
Upvotes: 0
Reputation: 403198
Unless you are dealing with large amounts of data, it really does not matter whether you use NumPy or pandas. Since you're using pandas, I would recommend just sticking to the basics unless you know you need otherwise.
To answer your original question, you can get the fraction of True values (multiply by 100 for a percentage) using mean:
(df['find_a'] == df['find_b']).mean()
# 0.6
Where,
df['find_a'] == df['find_b']
0 True
1 False
2 True
3 False
4 True
dtype: bool
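If you also want the count itself, here is a minimal sketch. It assumes df is rebuilt from the table in the question (the id columns are left out for brevity), and the variable names matches, match_count, match_pct and np_count are only illustrative. Since True counts as 1, summing the boolean Series gives the number of matches, and the NumPy line shows the same count on the underlying arrays:

```python
import numpy as np
import pandas as pd

# Sample data mirroring the table in the question (assumption: id columns omitted).
df = pd.DataFrame({
    "find_a": ["yes", "no", "no", "yes", "yes"],
    "find_b": ["yes", "yes", "no", "no", "yes"],
})

# Element-wise comparison gives a boolean Series.
matches = df["find_a"] == df["find_b"]

# True is treated as 1, so sum() counts the matching rows.
match_count = int(matches.sum())

# mean() is the fraction of True values; scale it to get a percentage.
match_pct = 100 * matches.mean()

# The same count computed with NumPy on the underlying arrays.
np_count = int(np.count_nonzero(df["find_a"].to_numpy() == df["find_b"].to_numpy()))

print(match_count, np_count, match_pct)
```

For the sample data both counts come out to 3 of 5 rows, which agrees with the 0.6 above.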
Upvotes: 4