Tom
Tom

Reputation: 21

Pandas filter rows based on values from 2 columns in another dataframe

Assume we have 2 dfs.

df1:

        Product   Name1    Name2
    0   ABC       aaa      bbb
    1   ASD       sss      ddd
    2   ZCX       xxx      NaN
    3   QWE       qqq      NaN
    4   IOP       iii      ooo

df2:

        Title     Col1     Col2
    0   aaaLKJ    ...      ...
    1   zzzMNB    ...      ...
    2   dddKJH    ...      ...
    3   ooo123    ...      ...
    4   aaa098    ...      ...
    5   ttt676    ...      ...

The output df should consists all of the df2 rows, in which the 'Title' starts with either 'Name1' or 'Name2' of df1.

        Title     Col1     Col2
    0   aaaLKJ    ...      ...
    2   dddKJH    ...      ...
    3   ooo123    ...      ...
    4   aaa098    ...      ...

Please let me know how this could be achieved. I have tried several ways with df.loc, isin, str.contains, but without luck ...

Upvotes: 2

Views: 68

Answers (1)

yatu
yatu

Reputation: 88305

You could build a list with all names from both columns in df1, and use Series.str.startswith on Title in df2, which also works with a tuple of strings, to find those which start with any term in names:

names = df1.filter(like='Name').stack().to_numpy()
# array(['aaa', 'bbb', 'sss', 'ddd', 'xxx', 'qqq', 'iii', 'ooo'], dtype=object)
df2 = df2[df2.Title.str.startswith(tuple(names))]

print(df2)
    Title Col1 Col2
0  aaaLKJ  ...  ...
2  dddKJH  ...  ...
3  ooo123  ...  ...
4  aaa098  ...  ...

Upvotes: 2

Related Questions