Reputation: 1731
Given:
   patient_id test_result  has_cancer
0       79452    Negative       False
1       81667    Positive        True
2       76297    Negative       False
3       36593    Negative       False
4       53717    Negative       False
5       67134    Negative       False
6       40436    Negative       False
How do I count the False or True values in a column in Python?
I have been trying:
# number of patients with cancer
number_of_patients_with_cancer= (df["has_cancer"]==True).count()
print(number_of_patients_with_cancer)
Upvotes: 90
Views: 210998
Reputation: 402413
If has_cancer has NaNs:
false_count = (~df.has_cancer).sum()
If has_cancer does not have NaNs, another option is to subtract the True count from the length of the dataframe, which avoids the negation. This is not necessarily better than the previous approach.
false_count = len(df) - df.has_cancer.sum()
Similarly, if you want just the count of True values:
true_count = df.has_cancer.sum()
If you want both:
fc, tc = df.has_cancer.value_counts().sort_index().tolist()
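One caveat about the last line: unpacking into fc, tc needs both False and True to be present in the column, otherwise the list has a single element and the assignment raises a ValueError. The sketch below rebuilds the question's sample data and uses reindex to fill in a zero for any missing value. The reindex step is an addition for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "patient_id": [79452, 81667, 76297, 36593, 53717, 67134, 40436],
    "test_result": ["Negative", "Positive", "Negative", "Negative",
                    "Negative", "Negative", "Negative"],
    "has_cancer": [False, True, False, False, False, False, False],
})

# True counts as 1 and False as 0, so sum() gives the number of True values.
true_count = int(df["has_cancer"].sum())

# Negate the boolean column to count the False values the same way.
false_count = int((~df["has_cancer"]).sum())

# Both counts at once; reindex guarantees two entries even if one value is absent.
counts = df["has_cancer"].value_counts().reindex([False, True], fill_value=0)
fc, tc = counts.tolist()

print(true_count, false_count, fc, tc)
```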
Upvotes: 66
Reputation: 2056
Count True:
df["has_cancer"].sum()
Count False:
(~df["has_cancer"]).sum()
See Boolean operators.
Upvotes: 0
Reputation: 323226
So you need value_counts?
df.has_cancer.value_counts()
Out[345]:
False    6
True     1
Name: has_cancer, dtype: int64
Upvotes: 112
Reputation: 193
Assuming the DataFrame above is named df:
True_Count = df[df.has_cancer == True]
len(True_Count)
Upvotes: 1
Reputation: 81
number_of_patients_with_cancer = df.has_cancer[df.has_cancer==True].count()
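This works where the attempt in the question does not because count() counts non-null entries, not True values. Applied to the unfiltered comparison it returns the number of rows; applied after filtering, only the True rows are left to count. A minimal sketch, using a standalone Series with the question's values:

```python
import pandas as pd

# The has_cancer column from the question as a standalone Series.
has_cancer = pd.Series(
    [False, True, False, False, False, False, False], name="has_cancer"
)

mask = has_cancer == True  # boolean mask, one entry per row

# count() ignores the values and counts every non-null entry.
rows_counted = int(mask.count())

# sum() adds the mask up, with True as 1 and False as 0.
trues_via_sum = int(mask.sum())

# Filtering first leaves only the True rows, so count() now gives the answer.
trues_via_filter = int(has_cancer[mask].count())

print(rows_counted, trues_via_sum, trues_via_filter)
```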
Upvotes: 8
Reputation: 11651
Just sum the column for a count of the Trues. False is just a special case of 0 and True a special case of 1. The False count would be your row count minus that, unless you've got NAs in there.
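To illustrate the caveat about missing values, here is a small sketch with a made-up object-dtype Series (not the question's data) that contains one NaN. Subtracting the True count from the row count then lumps the missing entry in with the Falses, while explicit comparisons keep the three cases apart.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value; it is stored as object dtype.
s = pd.Series([True, False, np.nan, False], dtype="object")

# Explicit comparisons treat NaN as neither True nor False.
true_count = int((s == True).sum())
false_count = int((s == False).sum())
missing_count = int(s.isna().sum())

# Row count minus the True count also includes the NaN row.
naive_false_count = len(s) - true_count

print(true_count, false_count, missing_count, naive_false_count)
```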
Upvotes: 0
Reputation: 517
0 True
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
9 False
If the pandas Series above is called example:
example.sum()
Then this code outputs 1, since there is only one True value in the series. To get the count of False:
len(example) - example.sum()
Upvotes: 16