ZK Zhao

Reputation: 21583

Python & Pandas: how to elegantly filter many dataframes?

I know I can use isin to filter a dataframe.

My question is that when I do this for many dataframes, the code looks a bit repetitive.

For example, below is how I filter several datasets to keep only the rows for specific users.

## filter data
df_order_filled  = df_order_filled[df_order_filled.user_id.isin(df_user.user_id)]
df_liquidate_order = df_liquidate_order[df_liquidate_order.user_id.isin(df_user.user_id)]
df_fee_discount_ = df_fee_discount_[df_fee_discount_.user_id.isin(df_user.user_id)]
df_dep_wit = df_dep_wit[df_dep_wit.user_id.isin(df_user.user_id)]

The name of the dataframe is repeated 3 times for each df, which is kind of unnecessary.

How can I simplify my code?

Thanks!

Upvotes: 1

Views: 52

Answers (1)

jezrael

Reputation: 863531

Use a list comprehension over a list of DataFrames:

dfs = [df_order_filled, df_liquidate_order, df_fee_discount_, df_dep_wit]

dfs1 = [x[x.user_id.isin(df_user.user_id)] for x in dfs]

The output is another list containing the filtered DataFrames, in the same order as dfs.
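
If the filtered frames should end up under their original names again, the list can be unpacked on the left-hand side. A minimal sketch, assuming small toy frames that are invented here purely for illustration (only two of the question's frames are used to keep it short):

```python
import pandas as pd

# Toy data, invented for illustration only
df_user = pd.DataFrame({"user_id": [1, 2]})
df_order_filled = pd.DataFrame({"user_id": [1, 3], "qty": [10, 20]})
df_dep_wit = pd.DataFrame({"user_id": [2, 2, 4], "amount": [5, 6, 7]})

dfs = [df_order_filled, df_dep_wit]

# Pull out the allowed ids once, then apply the same isin filter to every frame
user_ids = df_user["user_id"]

# Unpack the filtered list back into the original variable names;
# the order on the left must match the order in dfs
df_order_filled, df_dep_wit = [x[x["user_id"].isin(user_ids)] for x in dfs]
```

The frames still stored in dfs are left unfiltered; only the names on the left are rebound to the filtered copies.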

A similar idea is to use a dictionary:

dict1 = {'df_order_filled': df_order_filled, 
         'df_liquidate_order': df_liquidate_order, 
         'df_fee_discount': df_fee_discount_,
         'df_dep_wit':df_dep_wit}

dict2 = {k: x[x.user_id.isin(df_user.user_id)] for k, x in dict1.items()}
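
With the dictionary version, each filtered frame is looked up by its key. The sketch below wraps the comprehension in a small helper; the name filter_by_users and the toy frames are hypothetical and not part of the original answer:

```python
import pandas as pd

# Hypothetical helper (not from the original answer):
# returns a new dict with every frame restricted to the given user ids
def filter_by_users(frames, user_ids, col="user_id"):
    return {name: df[df[col].isin(user_ids)] for name, df in frames.items()}

# Toy data, invented for illustration only
df_user = pd.DataFrame({"user_id": [1, 2]})
dict1 = {
    "df_order_filled": pd.DataFrame({"user_id": [1, 3], "qty": [10, 20]}),
    "df_dep_wit": pd.DataFrame({"user_id": [2, 2, 4], "amount": [5, 6, 7]}),
}

dict2 = filter_by_users(dict1, df_user["user_id"])

# Each filtered frame is retrieved by its key
filtered_orders = dict2["df_order_filled"]
```

Keeping the frames in a dictionary avoids relying on list order, which may be easier to maintain when more frames are added later.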

Upvotes: 1
