Frank AK
Frank AK

Reputation: 1781

How to update pandas's column if they have the same columns's value?

Let's say, I have two original DataFrames like this:

df1 = pd.DataFrame({"ID": [101, 102, 103], "Price":[12, 33, 44], "something":[12,22,11]})
df2 = pd.DataFrame({"ID": [101, 103], "Price":[122, 133]})

And it displays like this:

    ID  Price  something
0  101     12          12
1  102     33          22
2  103     44          11

And

    ID  Price
0  101    122
1  103    133

Since I don't set any indices for any column, I want to know how can I update the df1 if both DataFrames have the same ID. For this sample, I hope I can get the result like this:

    ID  Price  something
0  101     122          12
1  102     33           22
2  103     133          11

You can see, I only care about the price column. Here is what I have tried for now:

pd.concat([df1,df2]).drop_duplicates(['ID'],keep='last') 

But it just shows me:

    ID  Price  something
1  102     33        22.0
0  101    122         NaN
1  103    133         NaN

I don't want any other column values to be changed.

I'd like to keep the order of the rows of df1.

UPDATE

After running the answer code, and I found the order of the columns will change, since we are using reset_index, something about index. so I hope someone can point me out how to keep the original position of my DataFrame. For now, it looks like below:

In [180]: df1 = pd.DataFrame({"ss":[12,22,11], "ID": [101, 102, 103], "Price":[12, 33, 44], "something":[12,22,11]}) 
     ...: df2 = pd.DataFrame({"ID": [101, 103], "Price":[122, 133]}) 

                                                                                                                                                                     

In [181]: df1.set_index('ID',inplace=True) 
     ...: df1.update(df2.set_index('ID')) 
     ...: df1.reset_index(inplace=True)                                                                                                                                                                                                       

In [182]: df1                                                                                                                                                                                                                                 
Out[182]: 
    ID  ss  Price  something
0  101  12  122.0         12
1  102  22   33.0         22
2  103  11  133.0         11

Upvotes: 3

Views: 1141

Answers (2)

Loochie
Loochie

Reputation: 2472

Another possible solution could be using combine_first()

df2.set_index(['ID']).combine_first(df1.set_index(['ID', 'something'])).reset_index()

And also by using isin()

df1.loc[df1.ID.isin(df2.ID), ['Price']] = df2[['Price']].values

Upvotes: 1

BENY
BENY

Reputation: 323236

Using np.where and isin update your price in df1 after merge

df1.Price=np.where(df1.ID.isin(df2.ID),df1.merge(df2,on='ID',how='left')['Price_y'],df1.Price)

df1

    ID  Price  something
0  101  122.0          12
1  102   33.0          22
2  103  133.0          11

Using update:

df1.set_index('ID',inplace=True)
df1.update(df2.set_index('ID'))
df1.reset_index(inplace=True)

df1 
    ID  Price  something
0  101  122.0          12
1  102   33.0          22
2  103  133.0          11

Upvotes: 4

Related Questions