reservoirinvest
reservoirinvest

Reputation: 1777

Lookup into dataframe from another with iloc

Have a dataframe with location and column for lookup as follows:

import pandas as pd
import numpy as np

i = ['dog', 'cat', 'bird', 'donkey'] * 100000

df1 = pd.DataFrame(np.random.randint(1, high=380, size=len(i)), 
                  ['cat', 'bird', 'donkey', 'dog'] * 100000).reset_index()
df1.columns = ['animal', 'locn']
df1.head()

The dataframe to be looked at is as follows:

df = pd.DataFrame(np.random.randn(len(i), 2), index=i,
     columns=list('AB')).rename_axis('animal').sort_index(0).reset_index()
df

Looking for a faster way to assign a column with value of B, for every record in df1.

df1.assign(val=[df[df.animal == a].iloc[b].B for a, b in zip(df1.animal, df1['locn'])])

...is pretty slow.

Upvotes: 2

Views: 39

Answers (1)

jezrael
jezrael

Reputation: 862731

Use GroupBy.cumcount for counter of animal column for positions, so possible use merge with left join:

df['locn'] = df.groupby('animal').cumcount()
df1['new'] = df1.merge(df.reset_index(), on=['animal','locn'], how='left')['B']

Verfify in smaller dataframe:

np.random.seed(2019)
i = ['dog', 'cat', 'bird', 'donkey'] *  100

df1 = pd.DataFrame(np.random.randint(1, high=10, size=len(i)), 
                  ['cat', 'bird', 'donkey', 'dog'] * 100).reset_index()
df1.columns = ['animal', 'locn']
print (df1)

df = pd.DataFrame(np.random.randn(len(i), 2), index=i,
     columns=list('AB')).rename_axis('animal').sort_index(0).reset_index()

df1 = df1.assign(val=[df[df.animal == a].iloc[b].B for a, b in zip(df1.animal, df1['locn'])])

df['locn'] = df.groupby('animal').cumcount()
df1['new'] = df1.merge(df.reset_index(), on=['animal','locn'], how='left')['B']

locn = df.groupby('animal').cumcount()
df1 = df1.assign(new1 = df1.merge(df.reset_index().assign(locn = locn), 
                                  on=['animal','locn'], how='left')['B'])

print (df1.head(10))
   animal  locn       val       new      new1
0     cat     9 -0.535465 -0.535465 -0.535465
1    bird     3  0.296240  0.296240  0.296240
2  donkey     6  0.222638  0.222638  0.222638
3     dog     9  1.115175  1.115175  1.115175
4     cat     7  0.608889  0.608889  0.608889
5    bird     9 -0.025648 -0.025648 -0.025648
6  donkey     1  0.324736  0.324736  0.324736
7     dog     1  0.533579  0.533579  0.533579
8     cat     8 -1.818238 -1.818238 -1.818238
9    bird     9 -0.025648 -0.025648 -0.025648

Upvotes: 2

Related Questions