user11638654

Reputation: 315

'merge' 2 dataframes on elements from list?

I want to do the following merge (it is hard to describe in words). These are my DataFrames:

df8=pd.DataFrame({'names':[['Hans','Meier'],['Debby','Harry','Peter']]})
    names
 0 ['Hans','Meier']
 1 ['Debby','Harry','Peter']

df9=pd.DataFrame({'caller':['Hans','Meier','Debby','Harry','Peter'],'text':[['hi im hans'],['hi im meier'],['hi im debby'],['hi im harry'],['hi im peter']]})
df9.set_index(df9.caller, inplace = True)
df9.drop('caller', axis = 1, inplace = True)

 caller     text
 Hans        ['hi im hans']
 Meier       ['hi im meier']
 .
 .
 .

The result should look like this

      names                  content
0 ['Hans','Meier']          ['hi im hans', 'hi im meier']
1 ['Debby','Harry','Peter'] ['hi im debby', 'hi im harry', 'hi im peter']

That is, the text said by each person in df9 should appear in the df8 row whose names list contains that person.

I think it is similar to this question, but I don't see a solution there.

I looked through the pandas documentation on concatenate, join and merge, but didn't find a solution there either.

Upvotes: 2

Views: 100

Answers (4)

Acccumulation

Reputation: 3591

df8['content']= df8['names'].apply(lambda x: [df9.loc[name,'text'][0] for name in x])

This raises a KeyError if a name isn't found in df9. You can make it more robust with:

df8['content']= df8['names'].apply(lambda x: [df9['text'].get(name)[0] if df9['text'].get(name) else None for name in x])

This produces a list containing the text for every name found, and None for any name not found.

If you only use df9 as a lookup table, it would be more appropriate to store it as a dictionary (called my_dict here), in which case it would be:

df8['content']= df8['names'].apply(lambda x: [my_dict.get(name)[0] if my_dict.get(name) else None for name in x])
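The answer does not show how my_dict is built, so here is a minimal sketch under one assumption: my_dict maps each caller to the one-element text list, created with Series.to_dict(). The name 'Nobody' and the shortened data are made up for illustration, to show what happens with a name that is missing from df9.

```python
import pandas as pd

df8 = pd.DataFrame({'names': [['Hans', 'Meier'], ['Debby', 'Nobody']]})
df9 = pd.DataFrame(
    {'text': [['hi im hans'], ['hi im meier'], ['hi im debby']]},
    index=pd.Index(['Hans', 'Meier', 'Debby'], name='caller'),
)

# Assumption: my_dict maps caller -> one-element list of text
my_dict = df9['text'].to_dict()

# dict.get returns None for a missing key, so 'Nobody' becomes None
df8['content'] = df8['names'].apply(
    lambda x: [my_dict.get(name)[0] if my_dict.get(name) else None for name in x]
)
content = df8['content'].tolist()
print(content)
```

With a plain dictionary the lookup no longer goes through pandas indexing for every name, which is the main reason to prefer it when df9 is only a lookup table.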

Upvotes: 0

IanS

Reputation: 16241

You can look up the values in df9:

df8['contents'] = df8['names'].apply(lambda l: [df9['text'].loc[name] for name in l])

Upvotes: 5

BENY

Reputation: 323226

Here is one way

df9['text']=df9['text'].str[0]

l=[df9.loc[x,'text'].tolist() for x in df8.names]
Out[505]: [['hi im hans', 'hi im meier'], ['hi im debby', 'hi im harry', 'hi im peter']]

df8['cont']=l
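Note that selecting a list of labels with .loc raises a KeyError if any name is missing from the index. As a hedged variation (not part of the answer above), Series.reindex can be swapped in; it returns NaN for labels it cannot find. The name 'Nobody' and the shortened data are invented for the example, and the index is assumed to be unique.

```python
import pandas as pd

df8 = pd.DataFrame({'names': [['Hans', 'Meier'], ['Debby', 'Nobody']]})

# Text already reduced to plain strings, as after df9['text'].str[0]
text = pd.Series(
    ['hi im hans', 'hi im meier', 'hi im debby'],
    index=pd.Index(['Hans', 'Meier', 'Debby'], name='caller'),
)

# reindex keeps the requested order and fills missing labels with NaN
rows = [text.reindex(names).tolist() for names in df8['names']]

# Wrap in a Series so each row receives one list as a single cell
df8['cont'] = pd.Series(rows, index=df8.index)
print(df8)
```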

Upvotes: 6

anky

Reputation: 75080

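Using Series.get (this assumes df9 still has caller as a column, i.e. before the set_index/drop step in the question):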

d=df9.set_index('caller')['text']
df8=df8.assign(content=df8.names.apply(lambda x:[d.get(i) for i in x]))
print(df8)

                   names                                        content
0          [Hans, Meier]                  [[hi im hans], [hi im meier]]
1  [Debby, Harry, Peter]  [[hi im debby], [hi im harry], [hi im peter]]
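The output above has one extra level of nesting compared with the result asked for in the question. One possible way to get flat lists, sketched here on the assumption that every text cell is a one-element list, is to take the first element with .str[0] before the lookup:

```python
import pandas as pd

df8 = pd.DataFrame({'names': [['Hans', 'Meier'], ['Debby', 'Harry', 'Peter']]})
df9 = pd.DataFrame({
    'caller': ['Hans', 'Meier', 'Debby', 'Harry', 'Peter'],
    'text': [['hi im hans'], ['hi im meier'], ['hi im debby'],
             ['hi im harry'], ['hi im peter']],
})

# .str[0] takes the first element of each one-item list, leaving plain strings
d = df9.set_index('caller')['text'].str[0]

# Same lookup as above, now returning strings instead of lists
df8 = df8.assign(content=df8.names.apply(lambda x: [d.get(i) for i in x]))
flat = df8['content'].tolist()
print(flat)
```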

Upvotes: 5
