Reputation: 661
I have a data frame:
df = pd.DataFrame({
'BU': ['Total', 'Total', 'Total', 'CRS', 'CRS', 'CRS'],
'Line_Item': ['Revenues','EBT', 'Expenses', 'Revenues', 'EBT', 'Expenses'],
'1Q16': [100, 120, 0, 200, 190, 210],
'2Q16': [100, 0, 130, 200, 190, 210],
'3Q16': [200, 250, 0, 120, 0, 190]})
I want an output that counts the number of rows in 1Q16, 2Q16, 3Q16 columns that are greater than 0. So the output could be a list with entries [5, 5, 4]
I tried .loc
for slicing, but since zeros appear on different rows on different columns I can't do that. I want to loop through particular columns and count entries that are greater than 0
Upvotes: 1
Views: 127
Reputation: 133508
With your shown samples, please try following. Using boolean masking here.
cols = ['1Q16','2Q16','3Q16']
df[cols].gt(0).sum()
Output will be as follows:
1Q16 5
2Q16 5
3Q16 4
dtype: int64
Simple explanation would be:
cols
which has column values in it(where we want to work upon).gt(0)
to get all elements which are greater than 0
in df's columns(given in cols list).sum
function to sum all values more than 0 in given columns.Upvotes: 2
Reputation: 195438
lst = df.filter(regex=r"\dQ\d+").gt(0).sum().tolist()
print(lst)
Prints:
[5, 5, 4]
Or:
lst = (df.loc[:, "1Q16":] > 0).sum().tolist()
Upvotes: 1