Reputation: 35646
I have a pandas DataFrame like the following:
  id label_x label_y
0  1       F       R
1  2       F       F
2  3       F       F
3  4       F       F
4  5       F       F
Now I want to count how many rows have label_x equal to label_y and how many do not. In this case there is one row where they differ and four rows where they are equal.
import pandas as pd

df = pd.DataFrame({'id' : ["1","2","3","4","5"],
                   'label_x' : ["F","F","F","F","F"],
                   'label_y' : ["R","F","F","F","F"]})
Upvotes: 2
Views: 4835
Reputation: 35646
I came up with this solution. Is that the best one?
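def compare(row):
    # Label each row by whether the two label columns match.
    # Access by column name; integer keys on a row Series are deprecated.
    if row['label_x'] == row['label_y']:
        return 'yes'
    else:
        return 'no'

df['result'] = df.apply(compare, axis=1)
# Count rows per label and turn the result into a two-column DataFrame.
df2 = df.groupby('result').size().reset_index(name='count')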
Upvotes: 1
Reputation: 13955
(df.label_x == df.label_y).value_counts()
There are many ways to do that, including the above:
In [43]: (df.label_x == df.label_y).value_counts()
Out[43]:
True     4
False    1
dtype: int64
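If you need the two counts as plain integers, here is a minimal sketch of one other way, assuming the same df as in the question. The names equal, n_equal and n_not_equal are just illustrative. Summing the boolean mask also avoids a missing key when every row matches (value_counts would then have no False entry).

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "id": ["1", "2", "3", "4", "5"],
    "label_x": ["F", "F", "F", "F", "F"],
    "label_y": ["R", "F", "F", "F", "F"],
})

# Element-wise comparison gives a boolean Series, one entry per row.
equal = df["label_x"] == df["label_y"]

# True counts as 1 when summed, so the sum is the number of matching rows.
n_equal = int(equal.sum())
# Invert the mask to count the rows that differ.
n_not_equal = int((~equal).sum())

print(n_equal, n_not_equal)  # prints: 4 1
```

This stays vectorized, so it should be noticeably faster than apply with axis=1 on a large frame.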
Upvotes: 2