Trion
Trion

Reputation: 580

Updating a column from 1 Dataframe where other Dataframe has same keys

I have a dataframe like this,

Key Group RS
42 23 100
42 41 21
46 23 89
67 10 65

Other dataframe is like this,

Key Group RS
42 41 11
67 10 23

There is hierarchy: Key -> Group(unique within key) -> RS.

I wanted to replace the RS value in first dataframe if the Key and Group are same with second dataframe. So result should be like,

Key Group RS
42 23 100
42 41 11 // updated
46 23 89
67 10 23 // updated

I wanted to use the pandas .update function but it won't work with non-unique indexes, which is the case for Key as it repeats in multiple rows. I am confused on what to do.

Upvotes: 1

Views: 87

Answers (4)

Charles
Charles

Reputation: 3316

One workaround you can consider is updating your index (key) to be unique, then using the update method you mentioned. Another way might be to iterate through both arrays, updating as you go, but that's not as elegant.

Another thing you could try is using multi-index to recover uniqueness.

Upvotes: 0

BENY
BENY

Reputation: 323226

pd.concat +drop_duplicates

pd.concat([df1,df2],0).drop_duplicates(['Key','Group'],keep='last')
Out[1107]: 
   Key  Group   RS
0   42     23  100
2   46     23   89
0   42     41   11
1   67     10   23

Upvotes: 4

Zero
Zero

Reputation: 76917

Here's one way

In [718]: cols = ['Key', 'Group']

In [719]: df2.set_index(cols).combine_first(df1.set_index(cols)).reset_index()
Out[719]:
   Key  Group     RS
0   42     23  100.0
1   42     41   11.0
2   46     23   89.0
3   67     10   23.0

Details

In [720]: df1
Out[720]:
   Key  Group   RS
0   42     23  100
1   42     41   21
2   46     23   89
3   67     10   65

In [721]: df2
Out[721]:
   Key  Group  RS
0   42     41  11
1   67     10  23

Upvotes: 3

piRSquared
piRSquared

Reputation: 294228

Set indices and use loc to assign updated values. Then reset the index.
Note that this approach preserves the dtype of the 'RS' column.

d1 = df1.set_index(cols).RS
d2 = df2.set_index(cols).RS
d1.loc[d2.index] = d2
d1.reset_index()

   Key  Group   RS
0   42     23  100
1   42     41   11
2   46     23   89
3   67     10   23

Upvotes: 4

Related Questions