Django0602
Django0602

Reputation: 807

Pandas search replace using two datasets

I have a situation where I need to search replace values using two datasets basis on the first column or pandas default index. My df1 :

  id1  id2   id3   id4
0   1    2     3    4
1   3    4     5    6

My df2

  id_df2
0    500
1   1000
2   2000
3   3000
4   4000
5   5000
6   6000 

The final df should look like :

         id1     id2      id3     id4
    0    1000    2000     3000    4000
    1    3000    4000     5000    5000

The search has to happen using the pandas default index column and then the values from the df2 should be replaced in the df1.

How to achieve this?

Upvotes: 1

Views: 62

Answers (1)

jezrael
jezrael

Reputation: 862731

Use DataFrame.replace by Series created be select:

df = df1.replace(df2['id_df2'])
print (df)
    id1   id2   id3   id4
0  1000  2000  3000  4000
1  3000  4000  5000  6000

If exist all values and default RangeIndex in df2 you can improve performance by indexing in numpy and then create new DataFrame by constructor:

arr = df2['id_df2'].to_numpy()[df1.to_numpy()]
print (arr)
[[1000 2000 3000 4000]
 [3000 4000 5000 6000]]

df = pd.DataFrame(arr, index=df1.index, columns=df1.columns)
print (df)
    id1   id2   id3   id4
0  1000  2000  3000  4000
1  3000  4000  5000  6000

Upvotes: 2

Related Questions