Reputation: 891
I need to use a dictionary as a filter on a large dataframe, where the key-value pairs are values in different columns.
This dictionary is obtained from a separate dataframe, using dict(zip(df.id_col, df.rank_col))
so if a dictionary isn't the best way to go, that is open to change.
This is very similar to this question: Filter a pandas dataframe using values from a dict but fundamentally (I think) different because my dictionary contains column-paired values:
df_x = pd.DataFrame({'id':[1,1,1,2,2,2,3,3,3],
'B':[1,1,1,0,1,0,1,0,1], 'Rank':['1','2','3','1', '2','3','1','2','3'],'D':[1,2,3,4,5,6,7,8,9]})
filter_dict = {'1':'1', '2':'3', '3':'2'}
For this dataframe df_x
I would want to be able to look at the filter dictionary and apply it to a set of columns, here id
and Rank
, so the dataframe is pared down to:
The actual source dataframe is approx 1M rows, and the dictionary is >100 key-value pairs. Thanks for any help.
Upvotes: 2
Views: 2581
Reputation: 323226
You can check with isin
df_x[df_x[['id','Rank']].astype(str).apply(tuple,1).isin(filter_dict.items())]
Out[182]:
id B Rank D
0 1 1 1 1
5 2 0 3 6
7 3 0 2 8
Upvotes: 5