Reputation: 580
I have a dataframe like this,
Key Group RS
42 23 100
42 41 21
46 23 89
67 10 65
Other dataframe is like this,
Key Group RS
42 41 11
67 10 23
There is hierarchy: Key -> Group(unique within key) -> RS.
I wanted to replace the RS value in first dataframe if the Key and Group are same with second dataframe. So result should be like,
Key Group RS
42 23 100
42 41 11 // updated
46 23 89
67 10 23 // updated
I wanted to use the pandas .update
function but it won't work with non-unique indexes, which is the case for Key as it repeats in multiple rows. I am confused on what to do.
Upvotes: 1
Views: 87
Reputation: 3316
One workaround you can consider is updating your index (key
) to be unique, then using the update method you mentioned. Another way might be to iterate through both arrays, updating as you go, but that's not as elegant.
Another thing you could try is using multi-index to recover uniqueness.
Upvotes: 0
Reputation: 323226
pd.concat
+drop_duplicates
pd.concat([df1,df2],0).drop_duplicates(['Key','Group'],keep='last')
Out[1107]:
Key Group RS
0 42 23 100
2 46 23 89
0 42 41 11
1 67 10 23
Upvotes: 4
Reputation: 76917
Here's one way
In [718]: cols = ['Key', 'Group']
In [719]: df2.set_index(cols).combine_first(df1.set_index(cols)).reset_index()
Out[719]:
Key Group RS
0 42 23 100.0
1 42 41 11.0
2 46 23 89.0
3 67 10 23.0
Details
In [720]: df1
Out[720]:
Key Group RS
0 42 23 100
1 42 41 21
2 46 23 89
3 67 10 65
In [721]: df2
Out[721]:
Key Group RS
0 42 41 11
1 67 10 23
Upvotes: 3
Reputation: 294228
Set indices and use loc
to assign updated values. Then reset the index.
Note that this approach preserves the dtype
of the 'RS'
column.
d1 = df1.set_index(cols).RS
d2 = df2.set_index(cols).RS
d1.loc[d2.index] = d2
d1.reset_index()
Key Group RS
0 42 23 100
1 42 41 11
2 46 23 89
3 67 10 23
Upvotes: 4