Glenn

Reputation: 141

Using apply and lambda to iterate through a DataFrame, collecting values

My example is made up. I'd like to figure it out with apply() and lambda, though I've tried iterrows() too with no luck. I am trying to add a column to df2 that looks up values in df1 based on the 'Item 1' and 'Item 2' combination in each df2 row. Thanks in advance for your help.

import pandas as pd
import numpy as np
import random

names= ['A', 'B', 'C', 'D', 'E']

df1 = pd.DataFrame( np.arange(25).reshape(5,5), columns = names, index = names)

n=5
data = {'Item 1' : random.sample(names, n),
        'Item 2' : random.sample(names, n)}
df2 = pd.DataFrame(data)

#I can't get this to work. 
df2['New'] = df2.apply(lambda x: df1.loc[df2.loc[x, 'Item 1'], df2.loc[x, 'Item 2']], axis=1)

#Since this works, I assume my error is with apply and lambda.  Thanks.
x=2
df1.loc[df2.loc[x, 'Item 1'], df2.loc[x, 'Item 2']]

Upvotes: 1

Views: 3143

Answers (2)

rafaelc

Reputation: 59274

I would avoid using apply in general, and specifically a loc call inside a lambda function. It runs a Python-level lookup for every row, so it gets very slow as the DataFrame grows.

Use numpy's vectorization instead:

r = df2['Item 1'].map(dict(zip(df1.index, np.arange(len(df1.index)))))
c = df2['Item 2'].map(dict(zip(df1.columns, np.arange(len(df1.columns)))))

df2['new'] = df1.to_numpy()[r, c]
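
For reference, here is a self-contained sketch of the same idea. It assumes fixed example rows instead of random.sample so the result can be checked by hand, and it swaps the dict/map step for Index.get_indexer, which is an alternative way to turn labels into positions rather than part of the approach above.

```python
import numpy as np
import pandas as pd

names = ['A', 'B', 'C', 'D', 'E']
df1 = pd.DataFrame(np.arange(25).reshape(5, 5), columns=names, index=names)

# Fixed rows, assumed for illustration (the question uses random.sample)
df2 = pd.DataFrame({'Item 1': ['D', 'B', 'A', 'E', 'C'],
                    'Item 2': ['A', 'B', 'D', 'C', 'E']})

# Translate labels into integer positions within df1.
# Note: get_indexer returns -1 for a label that is not found,
# which would silently select the last row/column, so check for it
# if df2 may contain labels missing from df1.
r = df1.index.get_indexer(df2['Item 1'])
c = df1.columns.get_indexer(df2['Item 2'])

# NumPy integer-array indexing picks one value per (row, column) pair
df2['new'] = df1.to_numpy()[r, c]
```

With these rows, the first pair ('D', 'A') maps to position (3, 0) in df1, which holds 15.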

Upvotes: 2

Behzad Shayegh

Reputation: 333

df2['new'] = df2.apply(lambda x: df1.loc[x['Item 1'],x['Item 2']], axis=1)

output:

>>> df2
  Item 1 Item 2  new
0      D      A   15
1      B      B    6
2      A      D    3
3      E      C   22
4      C      E   14

Is that what you want? If it's not, please add a sample of the output you want to see.
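
As for why the version in the question failed: with axis=1, apply passes each row to the lambda as a Series, so x is the row itself rather than its index label, and df2.loc[x, 'Item 1'] ends up using a whole row as a key. A small sketch of this, with fixed rows assumed for illustration; the row's label is available as x.name if you do need it.

```python
import numpy as np
import pandas as pd

names = ['A', 'B', 'C', 'D', 'E']
df1 = pd.DataFrame(np.arange(25).reshape(5, 5), columns=names, index=names)

# Fixed rows, assumed for illustration
df2 = pd.DataFrame({'Item 1': ['D', 'B'], 'Item 2': ['A', 'B']})

# Record what apply hands to the function when axis=1:
# the type of x and the row label stored in x.name
seen = []
df2.apply(lambda x: seen.append((type(x).__name__, x.name)), axis=1)

# Read the two labels straight from the row Series
direct = df2.apply(lambda x: df1.loc[x['Item 1'], x['Item 2']], axis=1)

# Equivalent but more roundabout: go back through df2 using the row label
via_name = df2.apply(
    lambda x: df1.loc[df2.loc[x.name, 'Item 1'], df2.loc[x.name, 'Item 2']],
    axis=1,
)
```

Both direct and via_name give the same values; reading from x directly is the shorter of the two.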

Upvotes: 2
