lulu mirzai
lulu mirzai

Reputation: 77

pandas: add corresponding value to the second dataframe if column names matches the cell value in the second dataframe

I have two dataframes like the following but with more rows:

import pandas as pd

text1 = {'first_text': ['she is cool', 'they are nice', 'he is good', 'we are friendly'],
         'change_adj': ['she is neat', 'NaN', 'NaN', 'we are nice'],
         'change_pro': ['NaN', 'she is nice', 'NaN', 'she is friendly'],
         'change_verb': ['she was cool', 'they were nice', 'he was good', 'NaN'], }

df1 = pd.DataFrame(text1, columns=['first_text', 'change_adj', 'change_pro', 'change_verb'])

text2 = {
    'Domain': ['change_adj', 'change_pro', 'change_verb', 'change_adj', 'change_pro', 'change_verb', 'change_verb'],
    'info': ['she is neat', 'she is nice', 'she was cool', 'we are nice', 'she is friendly', 'they were nice',
             'he was good']}

df2 = pd.DataFrame(text2, columns=['Domain', 'info'])

so essentially the second dataframe is the stacked version of the first one minus the 'first_text' column. what I would like to do is to add the 'first_text' column to the second dataframe in a way that the sentences from the 'first_text' matches the info column in the second dataframe, like the following:

desired output:

            first_text       Domain              info

 0        she is cool     change_adj        she is neat
 1      they are nice     change_pro        she is nice
 2        she is cool     change_verb       she was cool
 3      we are friendly   change_adj        we are nice
 4      we are friendly   change_pro        she is friendly
 5      they are nice     change_verb       they were nice
 6      he is good        change_verb       he was good

Upvotes: 0

Views: 104

Answers (2)

sushanth
sushanth

Reputation: 8302

You can use pandas.melt in combination with pandas.merge

melt = df1.melt(id_vars='first_text', var_name="Domain", value_name="info")

df2.merge(melt, on=['Domain', 'info'], how='left')

        Domain             info       first_text
0   change_adj      she is neat      she is cool
1   change_pro      she is nice    they are nice
2  change_verb     she was cool      she is cool
3   change_adj      we are nice  we are friendly
4   change_pro  she is friendly  we are friendly
5  change_verb   they were nice    they are nice
6  change_verb      he was good       he is good

Upvotes: 3

Chris
Chris

Reputation: 29732

One way using pandas.DataFrame.query with itertuples:

res = []
for x, y in df2.itertuples(False, None):
    res.append(df1.query("%s == '%s'" % (x, y))["first_text"].iloc[0])    
df2["first_text"] = res
print(df2)

Output:

        Domain             info       first_text
0   change_adj      she is neat      she is cool
1   change_pro      she is nice    they are nice
2  change_verb     she was cool      she is cool
3   change_adj      we are nice  we are friendly
4   change_pro  she is friendly  we are friendly
5  change_verb   they were nice    they are nice
6  change_verb      he was good       he is good

Upvotes: 1

Related Questions