thxclown
thxclown

Reputation: 37

Match entire list with values from another dataframe

I have a dataframe with a list in one column and want to match all items in this list with a second dataframe. The matched values should then be added (as a list) to a new column in the first dataframe.

data = {'froots':  [['apple','banana'], ['apple','strawberry']]
        }
df1 = pd.DataFrame(data)

data = {'froot':  ['apple','banana','strawberry'],
        'age': [2,3,5]
        }
df2 = pd.DataFrame(data)

DF1

index fruits
1     ['apple','banana']
2     ['apple','strawberry']

DF2

index fruit age
1     apple 2
2     banana 3
3     strawberry 5

New DF1

index froots                  age
1     ['apple','banana']      [2,3]
2     ['apple','strawberry']  [2,5]

I have a simple solution that takes way too long:

age = list()
for index,row in df1.iterrows():
    numbers = row.froots
    tmp = df2[['froot','age']].apply(lambda x: x['age'] if x['froot'] in numbers else None, axis=1).dropna().tolist()
    age.append(tmp)
df1['age'] = age

Is there maybe a faster solution to this problem? Thanks in Advance!

Upvotes: 1

Views: 580

Answers (1)

jezrael
jezrael

Reputation: 862511

Use lsit comprehension with dictionary created by df2 and add new values to list if exist in dictionary tested by if:

d = df2.set_index('froot')['age'].to_dict()

df1['ag1e'] = df1['froots'].apply(lambda x: [d[y] for y in x if y in d])
print (df1)
                froots    ag1e
0      [apple, banana]  [2, 3]
1  [apple, strawberry]  [2, 5]

Upvotes: 2

Related Questions