I have two data frames and need to extract data from one based on the other. I got partway but couldn't get any further. I'm working in pandas.
For example, the main data frame, df1, looks like this:
fruit |
---|
apple |
banana |
grapes |
blueberry |
cherry |
And I have another data frame, df2, which looks like this:
subdata |
---|
nan |
pple |
che |
lsls |
lueberry |
cherry |
app |
Desired output:
newdata |
---|
nan |
pple |
che |
lueberry |
cherry |
app |
Here, I want to extract each value in df2 that appears in df1, either as a substring or as an exact match. For example, the first row of df2 is "nan", which is contained in "banana". A full match also counts: df2 has "cherry", which matches "cherry" in df1. "lsls" in the 4th row of df2 doesn't match anything in df1, so it won't be extracted. So I want to extract from df2 every value that fully or partially matches a value in df1. I tried some code but couldn't get it to work properly. I'd appreciate any help. Thank you.
Upvotes: 0
Views: 31
Reputation: 9047
You can try apply on df2's column, checking each value against df1 with str.contains.
For simplicity:
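import pandas as pd
# Sample data (all strings, so 'nan' here is text, not a missing value)
df1 = pd.DataFrame(['apple','banana','grapes','blueberry','cherry'], columns=['fruit'])
df2 = pd.DataFrame(['nan','pple','che','lsls','lueberry','cherry'], columns=['subdata'])
# Keep each df2 row whose value is a substring of at least one fruit.
# regex=False makes str.contains treat the value as literal text, not a regex.
df3 = df2[df2['subdata'].apply(lambda x: df1['fruit'].str.contains(x, regex=False).any())]
df3
#     subdata
# 0       nan
# 1      pple
# 2       che
# 4  lueberry
# 5    cherry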
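The sample above leaves out the "app" row from the question. Also note that str.contains interprets its argument as a regular expression unless regex=False is passed, and real missing values (float NaN rather than the text 'nan') would not be valid patterns. Below is a minimal sketch of one way to handle those cases using Python's plain `in` substring test. The helper name is_substring_of_any is hypothetical, and the newdata column name is taken from the desired output in the question.

```python
import pandas as pd

# Data from the question, including the 'app' row.
df1 = pd.DataFrame({"fruit": ["apple", "banana", "grapes", "blueberry", "cherry"]})
df2 = pd.DataFrame({"subdata": ["nan", "pple", "che", "lsls", "lueberry", "cherry", "app"]})


def is_substring_of_any(value, targets):
    """Hypothetical helper: True if `value` is a literal substring of any target."""
    # Real missing values (float NaN, None) are not strings and never match.
    if not isinstance(value, str):
        return False
    # `in` does a literal substring test, so characters such as '.' or '('
    # are not treated as regex syntax.
    return any(value in target for target in targets)


# Plain list of df1 strings, with any missing entries dropped.
fruits = df1["fruit"].dropna().tolist()

# Boolean mask over df2: which rows are contained in at least one fruit?
mask = df2["subdata"].map(lambda v: is_substring_of_any(v, fruits))

# Keep the matching rows, rename the column, and renumber the index from 0.
newdata = df2.loc[mask, "subdata"].rename("newdata").reset_index(drop=True).to_frame()
print(newdata)
```

Dropping reset_index(drop=True) keeps the original df2 row labels, which is useful if the result needs to be joined back to df2 later.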
Upvotes: 1