Reputation: 3657
I have a pandas DataFrame df:
Fruit    Apple  Orange  Banana  Pear
basket1      0       1      10    15
basket2      1       5       7    10
basket3     10      15       0     0
I want to remove columns (fruit types) based on the following two conditions.
If the sum of a fruit across basket1, basket2 and basket3 is less than 20, remove that column. The result in this case is:
Fruit    Orange  Pear
basket1       1    15
basket2       5    10
basket3      15     0
From the above result, I want to further remove columns where the number of baskets with more than 0 fruit is less than 3. The expected result is:
Fruit    Orange
basket1       1
basket2       5
basket3      15
Can you help me write code for this? I know how to get the sum of each fruit across all baskets with df.sum(axis=0), but I am unable to proceed from this point.
Upvotes: 0
Views: 218
Reputation: 214927
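You can use these two conditions:
df.sum().ge(20)
for the total sum (ge rather than gt, because only columns whose sum is less than 20 should be removed, so a sum of exactly 20 is kept), and
df.gt(0).sum().ge(3)
for the count of baskets with a positive amount. Each condition is evaluated per column, so combining them with & gives the same result as applying the two filters one after the other.
# Assumes Fruit is a regular column; make it the index so only the fruit counts are summed
df = df.set_index('Fruit')
df
#          Apple  Orange  Banana  Pear
# Fruit
# basket1      0       1      10    15
# basket2      1       5       7    10
# basket3     10      15       0     0
# Keep only the columns that satisfy both conditions
df.loc[:, df.sum().ge(20) & df.gt(0).sum().ge(3)]
#          Orange
# Fruit
# basket1       1
# basket2       5
# basket3      15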
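If it helps to see the two steps separately, here is a minimal self-contained sketch. It rebuilds the example data from the question, and the names min_total, min_baskets, enough_total, enough_baskets, step1 and result are my own, chosen for illustration. It assumes a sum of exactly 20 should be kept, since the question only says to remove sums less than 20.

```python
import pandas as pd

# Example data from the question, with the basket names as the index
df = pd.DataFrame(
    {
        "Apple": [0, 1, 10],
        "Orange": [1, 5, 15],
        "Banana": [10, 7, 0],
        "Pear": [15, 10, 0],
    },
    index=pd.Index(["basket1", "basket2", "basket3"], name="Fruit"),
)

min_total = 20    # remove columns whose total is less than this
min_baskets = 3   # remove columns with fewer baskets holding > 0 fruit

# One boolean per column: is the column total large enough?
enough_total = df.sum() >= min_total

# One boolean per column: do enough baskets contain a positive amount?
enough_baskets = (df > 0).sum() >= min_baskets

# First condition only (the intermediate result in the question)
step1 = df.loc[:, enough_total]

# Both conditions combined
result = df.loc[:, enough_total & enough_baskets]

print(list(step1.columns))
print(result)
```

Keeping the masks in named variables makes it easy to inspect which condition drops which column, for example by printing enough_total on its own.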
Upvotes: 3