Reputation: 35
Although this question seems somewhat similar to previous ones, I could not have it solved with previous answers and I need help from experts.
I am trying to create a column (e.g. 'Result') with the count of other columns with labels that start with 'X_', given a condition (eg. column element >1).
df1 = pd.DataFrame(np.array([[1, 0, 2], [2, 3, 1], [3, 0, 3]]), columns=['A', 'X_1', 'X_2'])
df1
A X_1 X_2
0 1 0 2
1 2 3 1
2 3 0 3
The desired output is:
df1
A X_1 X_2 Result
0 1 0 2 1
1 2 3 1 2
2 3 0 3 1
Can you please help me?
A novice Panda user
Upvotes: 1
Views: 36
Reputation: 71570
Try with gt
and sum
:
>>> df1['Result'] = df1.gt(0).sum(1) - df1['A'].gt(0)
>>> df1
A X_1 X_2 Result
0 1 0 2 1
1 2 3 1 2
2 3 0 3 1
>>>
Upvotes: 1
Reputation: 35626
We can filter
the DataFrame for columns that start with X_
test which values are ge
(greater than or equal to) 1 and sum
to count the number of values that are True
for each row (axis=1
):
df1['Result'] = df1.filter(regex='^X_').ge(1).sum(axis=1)
df1
:
A X_1 X_2 Result
0 1 0 2 1
1 2 3 1 2
2 3 0 3 1
Upvotes: 2