a1234

Reputation: 821

pandas or numpy - how to count true/false array returned

I have a dataframe called result:

find_a  id     find_b  id
yes     0001   yes     0001
no      0002   yes     0002
no      0003   no      0003
yes     0004   no      0004
yes     0005   yes     0005

I have the following:

result.find_a.values == result.find_b.values

Which returns an array of True/False values: array([ True, False,  True, False,  True])

How do I build on this and get a count of True? Once I have the count, I can work out the percentage of matched records between the columns, e.g. find_a matched find_b 40% of the time.

Also, I'm not sure if I am venturing down the numpy or pandas route...

Thanks for the help in advance.

Upvotes: 2

Views: 872

Answers (2)

WhiteHat

Reputation: 120

import numpy as np

len(result[result.find_a == result.find_b])   # count of matching rows
np.mean(result.find_a == result.find_b)       # fraction of matching rows
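
For the NumPy route mentioned in the question, a minimal sketch could look like the following. It assumes the sample data from the question, rebuilt here with a single id column for simplicity; the names matches, match_count and match_fraction are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumption: one id column only).
result = pd.DataFrame({
    "id": ["0001", "0002", "0003", "0004", "0005"],
    "find_a": ["yes", "no", "no", "yes", "yes"],
    "find_b": ["yes", "yes", "no", "no", "yes"],
})

# Element-wise comparison on the underlying NumPy arrays gives a boolean array.
matches = result["find_a"].to_numpy() == result["find_b"].to_numpy()

# Count the True entries, then take the mean (True counts as 1, False as 0).
match_count = int(np.count_nonzero(matches))
match_fraction = float(matches.mean())

print(match_count, match_fraction)
```

Here np.count_nonzero counts the True entries directly, which avoids building a filtered copy of the dataframe just to take its length.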

Upvotes: 0

cs95

Reputation: 403198

Unless you are dealing with large amounts of data, it really does not matter whether you use NumPy or pandas. Since you're using pandas, I would recommend just sticking to the basics unless you know you need otherwise.

To answer your original question, you can get the fraction of True values using mean (multiply by 100 for a percentage):

(df['find_a'] == df['find_b']).mean()
# 0.6

Where,

df['find_a'] == df['find_b']

0     True
1    False
2     True
3    False
4     True
dtype: bool
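
Since the question also asks for the count itself, here is a rough sketch that stays in pandas. It assumes the sample data from the question, rebuilt with a single id column; the names is_match, n_matches, pct_matches and counts are illustrative only.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: one id column only).
df = pd.DataFrame({
    "id": ["0001", "0002", "0003", "0004", "0005"],
    "find_a": ["yes", "no", "no", "yes", "yes"],
    "find_b": ["yes", "yes", "no", "no", "yes"],
})

# Boolean Series: True where the two columns agree.
is_match = df["find_a"] == df["find_b"]

# sum() treats True as 1, so it returns the number of matches.
n_matches = int(is_match.sum())

# mean() is the fraction of matches; scale by 100 for a percentage.
pct_matches = is_match.mean() * 100

# value_counts() reports the True and False counts together.
counts = is_match.value_counts()

print(n_matches, pct_matches)
print(counts)
```

Because sum and mean both work on the same boolean Series, the count and the percentage come from one comparison.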

Upvotes: 4
