Reputation: 131
I have a dataframe as follows:
name | value |
---|---|
aa | 0 |
aa | 0 |
aa | 1 |
aa | 0 |
aa | 0 |
bb | 0 |
bb | 0 |
bb | 1 |
bb | 0 |
bb | 0 |
bb | 0 |
For each 'name', I want to delete all rows that come after the first row where 'value' is 1. The expected result is:
name | value |
---|---|
aa | 0 |
aa | 0 |
aa | 1 |
bb | 0 |
bb | 0 |
bb | 1 |
What is the best way to do this? I thought about using groupby with some condition inside, but I can't figure out how to make it work.
Upvotes: 0
Views: 1058
Reputation: 151
Here's my approach to solving this.
# Imports.
import pandas as pd
# Creating a DataFrame.
df = pd.DataFrame([{'name': 'aa', 'value': 0},
                   {'name': 'aa', 'value': 0},
                   {'name': 'aa', 'value': 1},
                   {'name': 'aa', 'value': 0},
                   {'name': 'aa', 'value': 0},
                   {'name': 'bb', 'value': 0},
                   {'name': 'bb', 'value': 0},
                   {'name': 'bb', 'value': 1},
                   {'name': 'bb', 'value': 0},
                   {'name': 'bb', 'value': 0},
                   {'name': 'bb', 'value': 0},
                   {'name': 'bb', 'value': 0}])
# Filtering the DataFrame: within each name, keep the rows up to and including
# the first 1 (idxmax returns the index label of the first maximum).
df_filtered = (
    df.groupby('name')
      .apply(lambda x: x[x.index <= x['value'].idxmax()])
      .reset_index(drop=True)
)
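One thing to watch with idxmax: if a group contains no 1 at all, it returns the label of that group's first row, so only that row is kept. Below is a minimal sketch of a variant without apply that keeps such groups whole, assuming that is the behaviour you want (the question doesn't say) and that 'value' only holds 0 and 1. The helper name `truncate_after_first_one` and the extra `cc` group are made up for illustration.

```python
import pandas as pd


def truncate_after_first_one(df, group_col="name", value_col="value"):
    """Keep, per group, the rows up to and including the first 1 in value_col.

    Hypothetical helper; assumes value_col contains only 0 and 1.
    """
    # Running count of 1s within each group, including the current row.
    seen = df.groupby(group_col)[value_col].cumsum()
    # Subtracting the current value leaves the count of 1s strictly before the row.
    ones_before = seen - df[value_col]
    # A row survives only if no 1 has appeared before it in its group.
    return df[ones_before == 0].reset_index(drop=True)


df = pd.DataFrame({
    # "cc" is an illustrative group with no 1; it is not in the question's data.
    "name": ["aa"] * 5 + ["bb"] * 6 + ["cc"] * 2,
    "value": [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
})

result = truncate_after_first_one(df)
print(result)
```

With this version the two `cc` rows both survive, whereas the idxmax filter would keep only the first of them.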
Upvotes: 2
Reputation: 631
Not the prettiest way to do it, but this should work.
df = df.loc[df['value'].groupby(df['name']).cumsum().groupby(df['name']).cumsum() <= 1]
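To see why the double cumsum works, it may help to look at the intermediate columns. The sketch below uses the question's data; the names `first`, `second` and `steps` are only for illustration, and it assumes 'value' holds only 0 and 1.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["aa"] * 5 + ["bb"] * 6,
    "value": [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
})

# First cumsum per name: stays 0 until the first 1, then is at least 1.
first = df["value"].groupby(df["name"]).cumsum()
# Second cumsum per name: equals 1 exactly on the row of the first 1,
# and grows past 1 on every row after it.
second = first.groupby(df["name"]).cumsum()

# Side-by-side view of the intermediate steps.
steps = df.assign(first_cumsum=first, second_cumsum=second)
print(steps)

# Rows where the second cumsum is still <= 1 are the ones to keep.
result = df.loc[second <= 1]
print(result)
```

For `aa` the first cumsum is 0, 0, 1, 1, 1 and the second is 0, 0, 1, 2, 3, so the `<= 1` test keeps the first three rows.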
Upvotes: 2