Help needed
Help needed

Reputation: 35

Counting selected dataframe columns according to condition

Although this question seems somewhat similar to previous ones, I could not have it solved with previous answers and I need help from experts.

I am trying to create a column (e.g. 'Result') with the count of other columns with labels that start with 'X_', given a condition (eg. column element >1).

df1 = pd.DataFrame(np.array([[1, 0, 2], [2, 3, 1], [3, 0, 3]]), columns=['A', 'X_1', 'X_2'])
df1

    A   X_1 X_2
0   1   0   2
1   2   3   1
2   3   0   3

The desired output is:

df1
    A   X_1 X_2 Result
0   1   0   2   1
1   2   3   1   2
2   3   0   3   1

Can you please help me?

A novice Panda user

Upvotes: 1

Views: 36

Answers (2)

U13-Forward
U13-Forward

Reputation: 71570

Try with gt and sum:

>>> df1['Result'] = df1.gt(0).sum(1) - df1['A'].gt(0)
>>> df1
   A  X_1  X_2  Result
0  1    0    2       1
1  2    3    1       2
2  3    0    3       1
>>> 

Upvotes: 1

Henry Ecker
Henry Ecker

Reputation: 35626

We can filter the DataFrame for columns that start with X_ test which values are ge (greater than or equal to) 1 and sum to count the number of values that are True for each row (axis=1):

df1['Result'] = df1.filter(regex='^X_').ge(1).sum(axis=1)

df1:

   A  X_1  X_2  Result
0  1    0    2       1
1  2    3    1       2
2  3    0    3       1

Upvotes: 2

Related Questions