Dataframe count of positive values in range as a new column

Question

I have a dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 3), columns=list('XYZ'))
df.insert(0, 'NAME', pd.Series(list('ABCDEFGHIJ')))

and would like to add the count of positive entries in specified columns ('X', 'Y', 'Z') as a new column in the dataframe.

What's the best way of doing this?

cmaher · Accepted Answer

Here's one way to do it:

df['COUNT'] = df.select_dtypes(include='float64').gt(0).sum(axis=1)
#   NAME         X         Y         Z  COUNT
# 0    A -0.033066 -1.064625 -0.299286      0
# 1    B  0.902976 -1.703256 -0.011417      1
# 2    C -2.537364 -0.216643  1.051398      1
# 3    D  1.073677 -1.486599 -0.827829      1
# 4    E  2.157901  0.425371 -1.659263      2
# 5    F -1.589662 -0.382535  0.454324      1
# 6    G  0.487965  0.279265  0.820486      3
# 7    H  0.496104 -0.680161  0.763793      2
# 8    I -0.034518 -0.479307 -0.071954      0
# 9    J -0.170412  0.558505 -1.742784      1

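The select_dtypes method filters to columns of a given dtype, which is useful in cases like this because you don't need to worry about column names. Note that it picks up every float64 column, so if the dataframe has other float columns that shouldn't be counted, select the columns by name instead.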
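If you'd rather name the columns explicitly, something like the following should work. This is only a sketch: the `cols` list and the seeded generator are my own illustrative choices (the seed is there just to make the run reproducible), not part of the original answer.

```python
import numpy as np
import pandas as pd

# Seeded generator used only so this sketch is reproducible (an assumption,
# the question uses np.random.randn without a seed).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 3)), columns=list('XYZ'))
df.insert(0, 'NAME', pd.Series(list('ABCDEFGHIJ')))

# Hypothetical variable holding the columns to count over.
cols = ['X', 'Y', 'Z']
df['COUNT'] = df[cols].gt(0).sum(axis=1)

# For comparison: the dtype-based selection. COUNT is an integer column,
# so it is not picked up by include='float64'.
by_dtype = df.select_dtypes(include='float64').gt(0).sum(axis=1)
print(df)
```

Both selections give the same counts here because X, Y and Z are the only float64 columns in the frame.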

The .gt method tests each value for being strictly greater than the argument (in this case 0) and returns a dataframe of boolean values. Summing those booleans row-wise (axis=1) treats each True as 1, which gives the count of values meeting our criterion.
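To see the two steps separately, here is a small sketch on hand-picked values. The `small`, `mask` and `counts` names are made up for illustration.

```python
import pandas as pd

# Tiny hand-written frame so the result can be checked by eye.
small = pd.DataFrame({
    'X': [1.5, -0.2, 0.0],
    'Y': [-1.0, 0.3, 2.0],
    'Z': [0.7, 0.1, -3.0],
})

mask = small.gt(0)         # boolean frame: True where the value is > 0
counts = mask.sum(axis=1)  # True counts as 1, so this is a per-row count

print(mask)
print(counts.tolist())  # [2, 2, 1]
```

Note that the 0.0 in the last row is not counted, since .gt(0) is a strict comparison; .ge(0) would include zeros.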
