user16600616

Comparing two data frames and creating new data frame based on some conditions

I have two data frames, and I need to extract rows from one based on the other. I got partway but couldn't go any further. I'm working in pandas.

For example, the main data frame, df1, looks like this:

fruit
apple
banana
grapes
blueberry
cherry

And I have another data frame, df2, which looks like this:

subdata
nan
pple
che
lsls
lueberry
cherry
app

Desired output:

newdata
nan
pple
che
lueberry
cherry
app

I want to extract each value from df2 that appears in df1, either as a full match or as a substring. For example, the first row of df2 is "nan", which is contained in "banana", so it is kept. The sixth row of df2, "cherry", matches "cherry" in df1 exactly, so it is kept too. "lsls" in the fourth row of df2 does not match anything in df1, so it is not extracted. In short, I want every df2 value that fully or partially matches a value in df1. I tried some code but couldn't get it to work properly. I'd appreciate any help. Thank you.

Upvotes: 0

Views: 31

Answers (1)

Epsi95

Reputation: 9047

You can use apply together with str.contains for simplicity:

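import pandas as pd

df1 = pd.DataFrame(['apple', 'banana', 'grapes', 'blueberry', 'cherry'], columns=['fruit'])
# 'app' is included so the data matches the df2 shown in the question
df2 = pd.DataFrame(['nan', 'pple', 'che', 'lsls', 'lueberry', 'cherry', 'app'], columns=['subdata'])

# Keep each df2 row whose value is a substring of at least one df1 fruit.
# regex=False makes str.contains do a literal substring test.
df3 = df2[df2['subdata'].apply(lambda x: df1['fruit'].str.contains(x, regex=False).any())]

df3

#   subdata
# 0 nan
# 1 pple
# 2 che
# 4 lueberry
# 5 cherry
# 6 app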
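
If df2 might contain real missing values (float NaN or None) rather than the string 'nan', passing them to str.contains as a pattern will likely raise an error. Here is a minimal sketch of one way to guard against that, using Python's plain `in` operator instead of str.contains. The helper name is_substring_of_any is made up for illustration, and the data is assumed to be the same as in the question:

```python
import pandas as pd

df1 = pd.DataFrame({'fruit': ['apple', 'banana', 'grapes', 'blueberry', 'cherry']})
df2 = pd.DataFrame({'subdata': ['nan', 'pple', 'che', 'lsls', 'lueberry', 'cherry', 'app']})

# Plain Python list of the strings to search in (missing values dropped).
fruits = df1['fruit'].dropna().astype(str).tolist()

def is_substring_of_any(value, candidates=fruits):
    """Hypothetical helper: True if value is a literal substring of any candidate."""
    # Real missing values (None / float NaN) never match.
    if pd.isna(value):
        return False
    return any(str(value) in candidate for candidate in candidates)

# Boolean mask over df2, then filter; the original index is preserved.
mask = df2['subdata'].map(is_substring_of_any)
df3 = df2[mask]
print(df3)
```

Because `in` is a literal substring test, characters such as '.' or '(' in df2 are not interpreted as regex syntax. Call df3.reset_index(drop=True) if you want the result renumbered from 0.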

Upvotes: 1
