Reputation: 10033
I am filtering a pandas dataframe based on one or more conditions, like so:
def filter_dataframe(dataframe, position=None, team_id=None, home=None, window=None, min_games=0):
    df = dataframe.copy()
    if position:
        df = df[df['position_id'] == position]
    if team_id:
        df = df[df['team_id'] == team_id]
    if home:
        if home == 'home':
            df = df[df['home_dummy'] == 1.0]
        elif home == 'away':
            df = df[df['home_dummy'] == 0.0]
    if window:
        df = df[df['round_id'].between(1, window)]
    if min_games:
        df = df[df['games_num'] >= min_games]
    return df
But I don't think this is elegant.
Is there a simpler way of achieving the same result?
I thought of creating rules for the conditions, like in this SO answer, and then using any(rules)
to apply the filtering, if any is requested, but I don't know how to approach this. Any ideas?
Upvotes: 1
Views: 138
Reputation: 13458
You could try something like this:
def filter_dataframe(dataframe, position=None, team_id=None, home=None, window=None, min_games=0):
    df = dataframe.copy()
    # Each entry pairs an "is this filter active?" flag with a function that
    # builds the boolean mask. The function is only called when the flag is
    # true, and always on the current df, so active filters accumulate
    # instead of overwriting each other.
    masks = {
        "mask1": [position is not None, lambda d: d["position_id"] == position],
        "mask2": [team_id is not None, lambda d: d["team_id"] == team_id],
        "mask3": [home == "home", lambda d: d["home_dummy"] == 1.0],
        "mask4": [home == "away", lambda d: d["home_dummy"] == 0.0],
        "mask5": [window is not None, lambda d: d["round_id"].between(1, window)],
        "mask6": [bool(min_games), lambda d: d["games_num"] >= min_games],
    }
    for enabled, make_mask in masks.values():
        if enabled:
            df = df[make_mask(df)]
    return df
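Another option, closer to the "list of rules" idea in the question, is to collect one boolean mask per active condition and combine them in a single step. Note that the masks have to be combined with AND (every requested condition must hold), so `any(rules)` would not give the same result. The following is only a sketch: the column names come from the question, while the `sample` frame is made-up data for illustration.

```python
import operator
from functools import reduce

import pandas as pd


def filter_dataframe(dataframe, position=None, team_id=None, home=None,
                     window=None, min_games=0):
    # Build one boolean Series per condition that was actually requested.
    conditions = []
    if position is not None:
        conditions.append(dataframe["position_id"] == position)
    if team_id is not None:
        conditions.append(dataframe["team_id"] == team_id)
    if home in ("home", "away"):
        wanted = 1.0 if home == "home" else 0.0
        conditions.append(dataframe["home_dummy"] == wanted)
    if window is not None:
        conditions.append(dataframe["round_id"].between(1, window))
    if min_games:
        conditions.append(dataframe["games_num"] >= min_games)

    # No active condition: return an unfiltered copy.
    if not conditions:
        return dataframe.copy()

    # AND all masks together, then index once.
    mask = reduce(operator.and_, conditions)
    return dataframe[mask].copy()


# Hypothetical sample data, only to show the call.
sample = pd.DataFrame({
    "position_id": [1, 1, 2, 2],
    "team_id": [10, 20, 10, 20],
    "home_dummy": [1.0, 0.0, 1.0, 0.0],
    "round_id": [1, 2, 3, 4],
    "games_num": [5, 1, 3, 0],
})
home_position_1 = filter_dataframe(sample, position=1, home="home")
```

Because every mask is computed against the original frame and the frame is indexed only once, the order of the conditions does not matter and adding a new filter is a two-line change.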
Upvotes: 1