Reputation: 597
I have two dataframes, for example:
df1:
seq_id1 seq_id2
seq1_A seq2_B
seq2_A seq3_B
seq4_A seq9_B
seq9_A seq9_B
etc
and another dataframe such as:
df2:
sequences
seq2_A
seq9_A
I want to keep only the rows of the first dataframe whose ID is present in the second dataframe. Here it would be:
newdataframe merged:
seq_id1 seq_id2
seq2_A seq3_B
seq9_A seq9_B
Thanks for your help :)
Here are the dataframes:
First one, with only 60 rows: df1
Second one, with all the seq IDs: df2
Here the column ["#qseqid'"] in the first df has to match the column ["seq2_id"] in the restricted df2.
Upvotes: 1
Views: 53
Reputation: 862406
I believe you need to match column seq_id1 with df2['sequences']. Use isin with boolean indexing:
df1[df1['seq_id1'].isin(df2['sequences'])]
Or:
df = pd.merge(df1, df2, left_on='seq_id1', right_on='sequences')
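To make the difference between the two approaches concrete, here is a minimal sketch that rebuilds the sample data from the question (the four rows shown there; the real frames are assumed to have the same column names). The isin filter keeps df1's original columns and index, while merge also brings in the sequences column and resets the index, so the sketch drops that column afterwards.

```python
import pandas as pd

# Sample data copied from the question (only the rows that were shown).
df1 = pd.DataFrame({
    "seq_id1": ["seq1_A", "seq2_A", "seq4_A", "seq9_A"],
    "seq_id2": ["seq2_B", "seq3_B", "seq9_B", "seq9_B"],
})
df2 = pd.DataFrame({"sequences": ["seq2_A", "seq9_A"]})

# Boolean indexing: keep rows of df1 whose seq_id1 appears in df2['sequences'].
filtered = df1[df1["seq_id1"].isin(df2["sequences"])]

# Inner merge: same rows, but the result also carries the 'sequences' column,
# so it is dropped here to get back the two original columns.
merged = pd.merge(df1, df2, left_on="seq_id1", right_on="sequences")
merged = merged.drop(columns="sequences")

print(filtered)
print(merged)
```

One thing to keep in mind: if df2['sequences'] contains duplicate IDs, merge repeats the matching df1 rows once per duplicate, whereas isin does not.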
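If you need to match against either column of df1, pass the values as a list (DataFrame.isin aligns a Series on its index instead of comparing values):
df1[df1.isin(df2['sequences'].tolist()).any(axis=1)]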
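A small sketch of the either-column case follows. The data is the question's sample with one value changed for illustration (the third row's seq_id2 is set to seq9_A, which is an assumption and not in the original data), so that a row matches only through the second column. The values are passed as a plain list because DataFrame.isin, when given a Series, compares by index label and not by value.

```python
import pandas as pd

# Question's sample data; the third row's seq_id2 is changed to "seq9_A"
# (hypothetical, for illustration) so it matches via the second column only.
df1 = pd.DataFrame({
    "seq_id1": ["seq1_A", "seq2_A", "seq4_A", "seq9_A"],
    "seq_id2": ["seq2_B", "seq3_B", "seq9_A", "seq9_B"],
})
df2 = pd.DataFrame({"sequences": ["seq2_A", "seq9_A"]})

# Convert to a list so isin tests membership by value in every cell.
wanted = df2["sequences"].tolist()

# True for a row if at least one of its cells is in the wanted IDs.
mask = df1.isin(wanted).any(axis=1)
either = df1[mask]

print(either)
```

Replacing any(axis=1) with all(axis=1) would instead keep only rows where both columns hold a wanted ID.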
Upvotes: 3