Reputation: 833
I have a dataframe that describes whether work sites have tools. It has a simple true/false for describing if the tool is present. Unfortunately, the tool names are not standardized but I do have a list of them.
The dataframe looks like this:
+-------+---------+-------+-------+-------+-------+
| Index | County | State | JDKFK | YSAFQ | MDFKN |
+-------+----_----+-------+-------+-------+-------+
| 0 | '001' | '05' | TRUE | TRUE | TRUE |
| 1 | '030' | '05' | FALSE | TRUE | FALSE |
| 2 | '005' | '05' | TRUE | TRUE | FALSE |
+-------+---------+-------+-------+-------+-------+
Using my list, how can I produce a new column that has the sum of TRUE in the specified columns?
cols = ['JDKFK ', 'YSAFQ ', 'MDFKN ']
+-------+--------+-------+-------+-------+-------+----------+
| Index | County | State | JDKFK | YSAFQ | MDFKN | Tool_Sum |
+-------+--------+-------+-------+-------+-------+----------+
| 0 | '01' | '05' | TRUE | TRUE | TRUE | 3 |
| 1 | '01' | '05' | FALSE | TRUE | FALSE | 1 |
| 2 | '01' | '05' | TRUE | TRUE | FALSE | 2 |
+-------+--------+-------+-------+-------+-------+----------+
Count occurrences of False or True in a column in pandas does not answer my question because that post defines how to count True/False in a single column (vertical) whereas my question is how to count True/False across multiple columns (horizontal).
Upvotes: 1
Views: 1813
Reputation: 61910
You could use sum:
df['tool_sum'] = df[['JDKFK', 'YSAFQ', 'MDFKN']].sum(1)
print(df)
Output
Index County State JDKFK YSAFQ MDFKN tool_sum
0 0 '01' '05' True True True 3
1 1 '01' '05' False True False 1
2 2 '01' '05' True True False 2
The sum(1)
means that is going to sum across the second axis (i.e 1
because it counts from 0
). Another way to see it sum across columns.
Upvotes: 3