Reputation: 11657
I have two dataframes with multi-indexes looking like this:
pd.DataFrame({'observation': {('foo', '2017-04-16'): 'green',
('bar', '2017-04-25'): 'red',
('zap', '2017-04-16'): 'red',
('zip', '2017-04-25'): 'blue',
('zip', '2017-04-16'): 'white'},
'observation': {('zap', '2017-04-16'): np.nan,
('bar', '2017-04-27'): 'white',
('foo', '2017-05-16'): np.nan,
('foo', '2017-04-25'): 'red',
('zip', '2017-08-16'): 'red'}})
pd.DataFrame({'foo': {('00', '08'): '0.0',
('01', '08'): '0.0',
('01', '08'): '0.0',
('00', '08'): '1.0',
('03', '08'): '1.0',
('06', '08'): '0.0',
('00', '08'): '1.0',
('00', '08'): '1.0',
('00', '08'): '0.0',
('02', '08'): '0.0'},
'client_id': {('00', '08'): '1.0',
('01', '08'): '1.0',
('01', '08'): '1.0',
('00', '08'): '1.0',
('03', '08'): '1.0',
('06', '08'): '1.0',
('00', '08'): '1.0',
('00', '08'): '1.0',
('00', '08'): '1.0',
('02', '08'): '1.0'},
'execution_date': {('00', '08'): '2019-01-09',
('01', '08'): '2019-01-09',
('01', '08'): '2019-01-09',
('00', '08'): '2019-01-09',
('03', '08'): '2019-01-09',
('06', '08'): '2019-01-09',
('00', '08'): '2019-01-09',
('00', '08'): '2019-01-09',
('00', '08'): '2019-01-09',
('02', '08'): '2019-01-09'},
'del': {('00', '08'): '0.0',
('01', '08'): '0.0',
('01', '08'): '0.0',
('00', '08'): '0.0',
('03', '08'): '0.0',
('06', '08'): '0.0',
('00', '08'): '0.0',
('00', '08'): '0.0',
('00', '08'): '0.0',
('02', '08'): '0.0'},
'act': {('00', '08'): '11',
('01', '08'): '03',
('01', '08'): '06',
('00', '08'): '07',
('03', '08'): '07',
('06', '08'): '11',
('00', '08'): '28',
('00', '08'): '08',
('00', '08'): '14',
('02', '08'): '26'},
'obs': {('00', '08'): '02',
('01', '08'): '02',
('01', '08'): '02',
('00', '08'): '02',
('03', '08'): '02',
('06', '08'): '02',
('00', '08'): '02',
('00', '08'): '02',
('00', '08'): '02',
('02', '08'): '02'}})
The two are not the same size, and the values don't always overlap, but every index pair found in df1 is in df2. What I would like to do is update the observation col in the df1 with the values of observation
in df2, wherever it matches.
In other words, I would like to do the equivalent of an inner join based on the multi-index, and then overwrite the values in observation
in df1 with those from df2. But is there a way to do this in one step, using loc
/indexing? (This is structured as an index problem, but if there is a way to solve it using reset_index()
that would be fine too.)
Desired output:
obs
00 04 30
08 02
09 16
10 26
16 26
01 01 30
07 16
02 08 02
03 13 26
07 15 26
Upvotes: 0
Views: 232
Reputation: 61910
If I understood correctly, you could do:
df2 = pd.DataFrame({'observation': {('foo', '2017-04-16'): 'green',
('bar', '2017-04-25'): 'red',
('zap', '2017-04-16'): 'red',
('zip', '2017-04-25'): 'blue',
('zip', '2017-04-16'): 'white'},
'observation': {('zap', '2017-04-16'): 'yellow',
('bar', '2017-04-27'): 'white',
('foo', '2017-05-16'): 'black',
('foo', '2017-04-25'): 'red',
('zip', '2017-08-16'): 'red'}})
df['observation'] = df.index.map(dict(zip(df2.index, df2.observation)))
Output
observation
bar 2017-04-27 white
foo 2017-04-25 red
2017-05-16 black
zap 2017-04-16 yellow
zip 2017-08-16 red
Upvotes: 1