Reputation: 8090
I'm struggling with a MultiIndex dataframe (a) whose column x has to be set from b, which isn't a MultiIndex and has only 1 index level (the first level of a). I have an index selecting the values to change (ix), which is why I am using .loc[] for indexing. The problem is that the way the missing index levels are populated in a is not what I require (see example).
>>> a = pd.DataFrame({'a': [1, 2, 3], 'b': ['b', 'b', 'b'], 'x': [4, 5, 6]}).set_index(['a', 'b'])
>>> a
x
a b
1 b 4
2 b 5
3 b 6
>>> b = pd.DataFrame({'a': [1, 4], 'x': [9, 10]}).set_index('a')
>>> b
x
a
1 9
4 10
>>> ix = a.index[[0, 1]]
>>> ix
MultiIndex(levels=[[1, 2, 3], [u'b']],
codes=[[0, 1], [0, 0]],
names=[u'a', u'b'])
>>> a.loc[ix]
x
a b
1 b 4
2 b 5
>>> a.loc[ix, 'x'] = b['x']
>>> # wrong result (at least not what I want)
>>> a
x
a b
1 b NaN
2 b NaN
3 b 6.0
>>> # expected result
>>> a
x
a b
1 b 9 # index: a=1 is part of DataFrame b
2 b 5 # other indices don't exist in b and...
3 b 6 # ... x-values remain unchanged
# if there were more [1, ...] indices...
# ...x would also be set to 9
Upvotes: 2
Views: 524
Reputation: 694
I think you want to merge a and b. You should consider using the concat, merge or join functions.
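A minimal sketch of how the merge suggestion might look for the frames in the question. The names merged, x_b, in_ix and use_b are made up for illustration, and it assumes the keys in b are unique (otherwise the left merge would duplicate rows of a):

```python
import pandas as pd

a = pd.DataFrame({'a': [1, 2, 3], 'b': ['b', 'b', 'b'], 'x': [4, 5, 6]}).set_index(['a', 'b'])
b = pd.DataFrame({'a': [1, 4], 'x': [9, 10]}).set_index('a')
ix = a.index[[0, 1]]

# Flatten both frames so 'a' is an ordinary column, then left-merge on it.
# how='left' keeps every row of a in order; rows without a match get NaN in x_b.
merged = a.reset_index().merge(
    b.reset_index().rename(columns={'x': 'x_b'}), on='a', how='left'
)

# Overwrite x only for rows that are selected by ix AND have a match in b.
in_ix = pd.MultiIndex.from_frame(merged[['a', 'b']]).isin(ix)
use_b = in_ix & merged['x_b'].notna().to_numpy()
merged.loc[use_b, 'x'] = merged.loc[use_b, 'x_b'].astype(merged['x'].dtype)

# Restore the original two-level index and drop the helper column.
a = merged.set_index(['a', 'b'])[['x']]
print(a)
```

Rows whose key is missing from b (here a=2 and a=3) keep their original x.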
Upvotes: 1
Reputation: 12406
I first reset the multi-index of a and then set its index to the single column a:
a = a.reset_index()
a = a.set_index('a')
print(a)
b x
a
1 b 4
2 b 5
3 b 6
print(b)
x
a
1 9
4 10
Then make the assignment you require using loc, and re-set the multi-index afterwards. With loc, your ix = a.index[[0, 1]] becomes similar to [1, 0] (1 refers to the index label in a and 0 refers to the position in b):
a.loc[1, 'x'] = b.iloc[0, 0]
a.reset_index(inplace=True)
a = a.set_index(['a','b'])
print(a)
x
a b
1 b 9
2 b 5
3 b 6
EDIT:
Alternatively, reset the multi-index of a and don't set it to a single-column index. Then your [0, 1] (referring to index values with loc, not positions with iloc) can be used (0 refers to the index of a and 1 refers to the index of b):
a = a.reset_index()
print(a)
a b x
0 1 b 4
1 2 b 5
2 3 b 6
a.loc[0, 'x'] = b.loc[1,'x']
a = a.set_index(['a','b'])
print(a)
x
a b
1 b 9
2 b 5
3 b 6
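The assignments above hard-code the label 1. A hedged sketch of how the same reset-and-restore idea could avoid that, using DataFrame.update on the single-level frame; the names flat and wanted are made up for illustration, and it assumes the labels in b are unique:

```python
import pandas as pd

a = pd.DataFrame({'a': [1, 2, 3], 'b': ['b', 'b', 'b'], 'x': [4, 5, 6]}).set_index(['a', 'b'])
b = pd.DataFrame({'a': [1, 4], 'x': [9, 10]}).set_index('a')
ix = a.index[[0, 1]]

# Same first step as above: index a by its first level only.
flat = a.reset_index().set_index('a')

# Labels that are both selected by ix and present in b.
wanted = b.index.intersection(ix.get_level_values('a'))

# update() aligns on the index and overwrites only where b supplies a value.
flat.update(b.loc[wanted])

# Restore the two-level index.
a = flat.reset_index().set_index(['a', 'b'])
print(a)
```

Depending on the pandas version, x may come back as float rather than int after update().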
Upvotes: 0
Reputation: 1545
You are trying to assign from a frame with a 1-level index to a frame with a 2-level index, so the indexes cannot be aligned. Just use .values:
EDIT:
import pandas as pd
a = pd.DataFrame({'a': [1, 2, 3], 'b': ['b', 'b', 'b'], 'x': [4, 5, 6]}).set_index(['a', 'b'])
b = pd.DataFrame({'a': [1, 4], 'x': [9, 10]}).set_index('a')
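ix = a.index[[0, 1]]
a_level = a.index.get_level_values('a')
# rows selected by ix whose first-level label also exists in b
mask = a.index.isin(ix) & a_level.isin(b.index)
a.loc[mask, 'x'] = b.loc[a_level[mask], 'x'].values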
a:
x
a b
1 b 9
2 b 5
3 b 6
Upvotes: 0
Reputation: 150735
I can't think of any one-liner, so here's a multi-step approach:
tmp_df = a.loc[ix, ['x']].reset_index(level=1, drop=True)
tmp_df['x'] = b['x'].reindex(tmp_df.index).fillna(tmp_df['x'])  # keep the old x where b has no matching key
tmp_df.index = ix
a.loc[ix, 'x'] = tmp_df['x']
Output:
x
a b
1 b 9.0
2 b 5.0
3 b 6.0
Edit: I assume that the b's in the index are symbolic. Otherwise, the code will fail starting at a.loc[ix, 'x']. For example, with
a = pd.DataFrame({'a': [1, 1, 2, 3],
'b': ['b', 'b', 'b', 'b'],
'x': [4, 5, 3, 6]}).set_index(['a', 'b'])
a.loc[ix,'x']
gives:
a b
1 b 4
b 5
b 4
b 5
Name: x, dtype: int64
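For an index with repeated entries like the one above, one possible workaround is to skip label alignment entirely and look the values up by the first index level. This is only a sketch: the names level_a and hit are made up for illustration, and it assumes the labels in b are unique.

```python
import pandas as pd

a = pd.DataFrame({'a': [1, 1, 2, 3],
                  'b': ['b', 'b', 'b', 'b'],
                  'x': [4, 5, 3, 6]}).set_index(['a', 'b'])
b = pd.DataFrame({'a': [1, 4], 'x': [9, 10]}).set_index('a')

# First-level label of every row of a, duplicates included.
level_a = a.index.get_level_values('a')

# Rows whose first-level label exists in b.
hit = level_a.isin(b.index)

# Look up b's x for those rows and assign positionally through the boolean mask.
a.loc[hit, 'x'] = level_a[hit].map(b['x']).to_numpy()
print(a)
```

Both (1, b) rows end up with x = 9 and the other rows are unchanged. To restrict the update to a selection such as ix, the mask could additionally be combined with a.index.isin(ix).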
Upvotes: 0