Reputation: 315
I have df:
arrays = [np.array(['bar', 'bar', 'bar', 'bar', 'bar', 'bar', 'baz', 'baz', 'baz', 'baz', 'baz', 'baz', 'foo', 'foo', 'foo', 'foo', 'foo', 'foo']),
np.array(['one','two'] * 9),
np.array([1,2,3] * 6)]
df = pd.DataFrame(np.random.randn(18,2), index=arrays)
col0 col1
bar one 1 0.872359 -1.115871
two 2 -0.937908 -0.528563
one 3 -0.118874 0.286595
two 1 -0.507698 1.364643
one 2 1.507611 1.379498
two 3 -1.398019 -1.603056
baz one 1 1.498263 0.412380
two 2 -0.930022 -1.483657
one 3 -0.438157 1.465089
two 1 0.161887 1.346587
one 2 0.167086 1.246322
two 3 0.276344 -1.206415
foo one 1 -0.045389 -0.759927
two 2 0.087999 -0.435753
one 3 -0.232054 -2.221466
two 1 -1.299483 1.697065
one 2 0.612211 -1.076738
two 3 -1.482573 0.907826
And now I want to create 'NEW' column that:
for 'bar'
if index.level(2) > 1
"NEW" = col1
else
"NEW" = col2
for 'baz' the same with >2
for 'foo' the same with >3
How to do it without Py loops?
Upvotes: 2
Views: 64
Reputation: 862441
You can use get_level_values
for select index values by levels and then for new column numpy.where
:
#if possible use dictionary
d = {'bar':1, 'baz':2, 'foo':3}
m = df.index.get_level_values(2) > df.rename(d).index.get_level_values(0)
df['NEW'] = np.where(m, df.col1, df.col2)
For more general solution useSeries.rank
:
a = df.index.get_level_values(2)
b = df.index.get_level_values(0).to_series().rank(method='dense')
df['NEW'] = np.where(a > b, df.col1, df.col2)
Detail:
print (b)
bar 1.0
bar 1.0
bar 1.0
bar 1.0
bar 1.0
bar 1.0
baz 2.0
baz 2.0
baz 2.0
baz 2.0
baz 2.0
baz 2.0
foo 3.0
foo 3.0
foo 3.0
foo 3.0
foo 3.0
foo 3.0
dtype: float64
Upvotes: 2
Reputation: 95873
I think you can avoid looping over df.level(0) by using df.factorize
:
In [40]: thresh = pd.factorize(df.index.get_level_values(0).values)[0] + 1
In [41]: mask = df.index.get_level_values(2) > thresh
In [42]: df['NEW'] = np.where(mask, df.col1, df.col2)
In [43]: df
Out[43]:
col1 col2 NEW
bar one 1 0.247222 -0.270104 -0.270104
two 2 0.429196 -1.385352 0.429196
one 3 0.782293 -1.565623 0.782293
two 1 0.392214 1.023960 1.023960
one 2 -1.628410 -0.484275 -1.628410
two 3 0.256757 0.529373 0.256757
baz one 1 -0.568608 -0.776577 -0.776577
two 2 2.142408 -0.815413 -0.815413
one 3 0.860080 0.501965 0.860080
two 1 -0.267029 -0.025360 -0.025360
one 2 0.187145 -0.063436 -0.063436
two 3 0.351296 -2.050649 0.351296
foo one 1 0.704941 0.176698 0.176698
two 2 -0.380353 1.027745 1.027745
one 3 -1.337364 -0.568359 -0.568359
two 1 -0.588601 -0.800426 -0.800426
one 2 1.513358 -0.616237 -0.616237
two 3 0.244831 1.027109 1.027109
Upvotes: 1