Roman Lents

Reputation: 131

Delete some rows in dataframe based on condition in another column

I have a dataframe as follows:

name value
aa 0
aa 0
aa 1
aa 0
aa 0
bb 0
bb 0
bb 1
bb 0
bb 0
bb 0

Within each 'name' group, I want to delete every row that comes after the first row where 'value' is 1. The expected result is:

name value
aa 0
aa 0
aa 1
bb 0
bb 0
bb 1

What is the best way to do this? I thought about using groupby with some condition inside, but I cannot figure out how to make it work.

Upvotes: 0

Views: 1058

Answers (2)

George

Reputation: 151

Here's my approach to solving this: within each group, keep only the rows whose index is at or before the index of the first 1.

# Imports.
import pandas as pd

# Creating a DataFrame.
df = pd.DataFrame([{'name': 'aa', 'value': 0},
                   {'name': 'aa', 'value': 0},
                   {'name': 'aa', 'value': 1},
                   {'name': 'aa', 'value': 0},
                   {'name': 'aa', 'value': 0},
                   {'name': 'bb', 'value': 0},
                   {'name': 'bb', 'value': 0},
                   {'name': 'bb', 'value': 1},
                   {'name': 'bb', 'value': 0},
                   {'name': 'bb', 'value': 0},
                   {'name': 'bb', 'value': 0},
                   {'name': 'bb', 'value': 0}])
# Filtering the DataFrame.
df_filtered = df.groupby('name').apply(lambda x: x[x.index <= x['value'].idxmax()]).reset_index(drop=True)
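
One caveat with the idxmax approach: idxmax returns the first row of a group when the group contains no 1 at all, so such a group would be cut down to a single row. Recent pandas versions may also warn when groupby.apply operates on the grouping column. A minimal sketch of a variant without apply is below, assuming 'value' only ever holds 0 or 1. The group 'cc' is made up here to show the no-1 case and is not part of the question's data.

```python
import pandas as pd

# Question's data plus a hypothetical group 'cc' that never contains a 1.
df = pd.DataFrame({
    'name':  ['aa'] * 5 + ['bb'] * 6 + ['cc'] * 2,
    'value': [0, 0, 1, 0, 0,  0, 0, 1, 0, 0, 0,  0, 0],
})

# Number of 1s seen strictly before each row, counted per group.
ones_before = df.groupby('name')['value'].cumsum() - df['value']

# Keep a row only if no 1 has appeared earlier in its group.
df_filtered = df[ones_before == 0].reset_index(drop=True)
print(df_filtered)
```

Groups without any 1 are kept whole here, which may or may not be what you want.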

Upvotes: 2

Tom S

Reputation: 631

Not the prettiest way to do it, but this should work:

df = df.loc[df['value'].groupby(df['name']).cumsum().groupby(df['name']).cumsum() <= 1]
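
To see why the double cumsum works, it may help to split it into steps. This is only an illustrative breakdown using the question's data; the names seen and rank are made up for the sketch, and 'value' is assumed to contain only 0 and 1.

```python
import pandas as pd

# The data from the question.
df = pd.DataFrame({
    'name':  ['aa'] * 5 + ['bb'] * 6,
    'value': [0, 0, 1, 0, 0,  0, 0, 1, 0, 0, 0],
})

# First cumsum per group: 0 before the first 1, then 1 from that row onwards.
seen = df.groupby('name')['value'].cumsum()

# Second cumsum per group: 0 before the first 1, exactly 1 on that row,
# and 2, 3, ... on every row after it.
rank = seen.groupby(df['name']).cumsum()

# Rows up to and including the first 1 are the ones with rank <= 1.
result = df.loc[rank <= 1]
print(result)
```

The original index labels are preserved; add .reset_index(drop=True) if you want a fresh 0..n-1 index.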

Upvotes: 2
