Count rows across columns in a dataframe if they are greater than another value

Question

I have a data frame:

df = pd.DataFrame({ 
    'BU': ['Total', 'Total', 'Total', 'CRS', 'CRS', 'CRS'], 
    'Line_Item': ['Revenues','EBT', 'Expenses', 'Revenues', 'EBT', 'Expenses'],
    '1Q16': [100, 120, 0, 200, 190, 210],
    '2Q16': [100, 0, 130, 200, 190, 210],
    '3Q16': [200, 250, 0, 120, 0, 190]})

I want an output that counts the number of rows in 1Q16, 2Q16, 3Q16 columns that are greater than 0. So the output could be a list with entries [5, 5, 4]

I tried .loc for slicing, but since zeros appear on different rows on different columns I can't do that. I want to loop through particular columns and count entries that are greater than 0

RavinderSingh13 · Accepted Answer

With your shown samples, please try following. Using boolean masking here.

cols = ['1Q16','2Q16','3Q16']
df[cols].gt(0).sum()

Output will be as follows:

1Q16    5
2Q16    5
3Q16    4
dtype: int64

Simple explanation would be:

Creating list named cols which has column values in it(where we want to work upon).
Using gt(0) to get all elements which are greater than 0 in df's columns(given in cols list).
Then using sum function to sum all values more than 0 in given columns.

Count rows across columns in a dataframe if they are greater than another value

Answers (2)

Related Questions