Filter one dataframe using another through a loop pandas

Question

I have two dataframes df1 and df2. df1 as two columns with distinct values across multiple rows. It looks similar to the following:

col_a	col_b
aa	50
bb	132
dd	543

df2 has similar structure to the following:

col_a	col__b	col_x	col_y	col_z
aa	xy	xy	xy	2
aa	xy	xy	xy	3
bb	xy	xy	xy	14
bb	xy	xy	xy	9
bb	xy	xy	xy	6
cc	xy	xy	xy	0
cc	xy	xy	xy	2
dd	xy	xy	xy	0
dd	xy	xy	xy	2

I want to filter df2 through a loop using values from df1 in a way that after first iteration of loop I end up with the follow:

col_a	col_b	col_x	col_y	col_z
aa	xy	xy	xy	2
aa	xy	xy	xy	3

And after second iteration I should have the following:

col_a	col_b	col_x	col_y	col_z
bb	xy	xy	xy	14
bb	xy	xy	xy	9
bb	xy	xy	xy	6

I do not want the filtering to modify the df2 permanently. Subsequent iterations should filter df2 based on values from col_a of df1.

I have no idea how to achieve this so I would appreciate any help.

Hammurabi · Accepted Answer

You can iterate through your unique column values, then filter your df2 based on each unique value. You don't need drop_duplicates if df1 values are unique, but I include it just in case in this toy example:

df1 = pd.DataFrame({'col_a':['aa', 'bb']})
df2 = pd.DataFrame({'col_a':['aa', 'aa', 'bb', 'bb'], 'other column':[1, 2, 3, 4]})

for col_value in df1['col_a'].drop_duplicates():
    df_temp_filtered = df2[df2['col_a'] == col_value]

to get this df_temp_filtered in round 2:

  col_a  other column
2    bb             3
3    bb             4

Filter one dataframe using another through a loop pandas

Answers (1)

Related Questions

col_a	col__b	col_x	col_y	col_z
aa	xy	xy	xy	2
aa	xy	xy	xy	3
bb	xy	xy	xy	14
bb	xy	xy	xy	9
bb	xy	xy	xy	6
cc	xy	xy	xy	0
cc	xy	xy	xy	2
dd	xy	xy	xy	0
dd	xy	xy	xy	2

col_a	col__b	col_x	col_y	col_z
aa	xy	xy	xy	2
aa	xy	xy	xy	3
bb	xy	xy	xy	14
bb	xy	xy	xy	9
bb	xy	xy	xy	6
cc	xy	xy	xy	0
cc	xy	xy	xy	2
dd	xy	xy	xy	0
dd	xy	xy	xy	2

col_a	col__b	col_x	col_y	col_z
aa	xy	xy	xy	2
aa	xy	xy	xy	3
bb	xy	xy	xy	14
bb	xy	xy	xy	9
bb	xy	xy	xy	6
cc	xy	xy	xy	0
cc	xy	xy	xy	2
dd	xy	xy	xy	0
dd	xy	xy	xy	2