Reputation: 21583
I know I can use isin to filter a dataframe. My question is that when I do this for many dataframes, the code looks a bit repetitive.
For example, below is how I filter several datasets to keep only the rows for specific users:
## filter data
df_order_filled = df_order_filled[df_order_filled.user_id.isin(df_user.user_id)]
df_liquidate_order = df_liquidate_order[df_liquidate_order.user_id.isin(df_user.user_id)]
df_fee_discount_ = df_fee_discount_[df_fee_discount_.user_id.isin(df_user.user_id)]
df_dep_wit = df_dep_wit[df_dep_wit.user_id.isin(df_user.user_id)]
The name of each dataframe is repeated 3 times per line, which is kind of unnecessary.
How can I simplify my code?
Thanks!
Upvotes: 1
Views: 52
Reputation: 863531
Use a list comprehension over a list of DataFrames:
dfs = [df_order_filled, df_liquidate_order, df_fee_discount_, df_dep_wit]
dfs1 = [x[x.user_id.isin(df_user.user_id)] for x in dfs]
The output is another list containing the filtered DataFrames, in the same order as dfs.
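If you want the filtered results back under the original variable names, one option is to unpack the list directly. The following is only a minimal sketch: the toy data and its column values are invented for illustration, and only two of the four DataFrames are shown.

```python
import pandas as pd

# Toy data, invented purely for illustration
df_user = pd.DataFrame({"user_id": [1, 2]})
df_order_filled = pd.DataFrame({"user_id": [1, 3, 2], "qty": [10, 20, 30]})
df_dep_wit = pd.DataFrame({"user_id": [3, 4, 1], "amount": [5.0, 6.0, 7.0]})

dfs = [df_order_filled, df_dep_wit]

# Filter each frame, then unpack the results back into the original names.
# The number of names on the left must match the length of dfs.
df_order_filled, df_dep_wit = [x[x.user_id.isin(df_user.user_id)] for x in dfs]

print(df_order_filled.user_id.tolist())  # [1, 2]
print(df_dep_wit.user_id.tolist())       # [1]
```

The order of names on the left has to match the order in dfs, so this is easy to get wrong with many DataFrames; a dictionary avoids that.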
A similar idea is to use a dictionary:
dict1 = {'df_order_filled': df_order_filled,
'df_liquidate_order': df_liquidate_order,
'df_fee_discount_': df_fee_discount_,
'df_dep_wit':df_dep_wit}
dict2 = {k: x[x.user_id.isin(df_user.user_id)] for k, x in dict1.items()}
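To avoid repeating the filter expression itself, the condition could also be wrapped in a small helper. This is just a sketch: filter_by_user is a hypothetical name (not a pandas function), and the sample data is made up.

```python
import pandas as pd

def filter_by_user(df, users, col="user_id"):
    """Hypothetical helper: keep rows of df whose `col` value appears in users[col]."""
    return df[df[col].isin(users[col])]

# Toy data, invented purely for illustration
df_user = pd.DataFrame({"user_id": [1, 2]})
dict1 = {
    "df_order_filled": pd.DataFrame({"user_id": [1, 3], "qty": [10, 20]}),
    "df_dep_wit": pd.DataFrame({"user_id": [2, 2, 5], "amount": [1.0, 2.0, 3.0]}),
}

# Apply the same filter to every DataFrame, keeping the keys
dict2 = {k: filter_by_user(x, df_user) for k, x in dict1.items()}

print({k: len(v) for k, v in dict2.items()})
```

Individual results are then accessed by key, for example dict2["df_dep_wit"].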
Upvotes: 1