Reputation: 77
I have two dataframes like the following but with more rows:
import pandas as pd
text1 = {'first_text': ['she is cool', 'they are nice', 'he is good', 'we are friendly'],
'change_adj': ['she is neat', 'NaN', 'NaN', 'we are nice'],
'change_pro': ['NaN', 'she is nice', 'NaN', 'she is friendly'],
'change_verb': ['she was cool', 'they were nice', 'he was good', 'NaN'], }
df1 = pd.DataFrame(text1, columns=['first_text', 'change_adj', 'change_pro', 'change_verb'])
text2 = {
'Domain': ['change_adj', 'change_pro', 'change_verb', 'change_adj', 'change_pro', 'change_verb', 'change_verb'],
'info': ['she is neat', 'she is nice', 'she was cool', 'we are nice', 'she is friendly', 'they were nice',
'he was good']}
df2 = pd.DataFrame(text2, columns=['Domain', 'info'])
so essentially the second dataframe is the stacked version of the first one minus the 'first_text' column. what I would like to do is to add the 'first_text' column to the second dataframe in a way that the sentences from the 'first_text' matches the info column in the second dataframe, like the following:
desired output:
first_text Domain info
0 she is cool change_adj she is neat
1 they are nice change_pro she is nice
2 she is cool change_verb she was cool
3 we are friendly change_adj we are nice
4 we are friendly change_pro she is friendly
5 they are nice change_verb they were nice
6 he is good change_verb he was good
Upvotes: 0
Views: 104
Reputation: 8302
You can use pandas.melt in combination with pandas.merge
melt = df1.melt(id_vars='first_text', var_name="Domain", value_name="info")
df2.merge(melt, on=['Domain', 'info'], how='left')
Domain info first_text
0 change_adj she is neat she is cool
1 change_pro she is nice they are nice
2 change_verb she was cool she is cool
3 change_adj we are nice we are friendly
4 change_pro she is friendly we are friendly
5 change_verb they were nice they are nice
6 change_verb he was good he is good
Upvotes: 3
Reputation: 29732
One way using pandas.DataFrame.query
with itertuples
:
res = []
for x, y in df2.itertuples(False, None):
res.append(df1.query("%s == '%s'" % (x, y))["first_text"].iloc[0])
df2["first_text"] = res
print(df2)
Output:
Domain info first_text
0 change_adj she is neat she is cool
1 change_pro she is nice they are nice
2 change_verb she was cool she is cool
3 change_adj we are nice we are friendly
4 change_pro she is friendly we are friendly
5 change_verb they were nice they are nice
6 change_verb he was good he is good
Upvotes: 1