user13744439

Reputation: 142

Setting multiple columns at once gives a "Not in index" error

import pandas as pd
df = pd.DataFrame(
    [
        [5, 2],
        [3, 5],
        [5, 5],
        [8, 9],
        [90, 55]
    ],
    columns = ['max_speed', 'shield']
)

df.loc[(df.max_speed > df.shield), ['stat', 'delta']] \
    = 'overspeed', df['max_speed'] - df['shield']

I am setting multiple columns with .loc as shown above, and in some cases I get a "Not in index" error. Am I doing something wrong?

Upvotes: 3

Views: 59

Answers (1)

jezrael

Reputation: 863146

Create a list of tuples whose length equals the number of True values in the mask, pairing the repeated scalar 'overspeed' with the subtracted Series filtered by that mask:

m = (df.max_speed > df.shield)
s = df['max_speed'] - df['shield']

df.loc[m, ['stat', 'delta']] = list(zip(['overspeed'] * m.sum(), s[m]))
print(df)
   max_speed  shield       stat  delta
0          5       2  overspeed    3.0
1          3       5        NaN    NaN
2          5       5        NaN    NaN
3          8       9        NaN    NaN
4         90      55  overspeed   35.0

Another idea is a helper DataFrame filtered by the same mask:

df.loc[m, ['stat', 'delta']] = pd.DataFrame({'stat':'overspeed', 'delta':s})[m]

Details:

print(list(zip(['overspeed'] * m.sum(), s[m])))
[('overspeed', 3), ('overspeed', 35)]

print(pd.DataFrame({'stat':'overspeed', 'delta':s})[m])
        stat  delta
0  overspeed      3
4  overspeed     35
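
What both right-hand sides above have in common is that they carry exactly one row per True in the mask. Here is a small self-contained sketch to check that on the sample data from the question; the names rows, helper and n_selected are just mine for illustration:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame(
    [[5, 2], [3, 5], [5, 5], [8, 9], [90, 55]],
    columns=['max_speed', 'shield'],
)

m = df['max_speed'] > df['shield']   # boolean mask, True for rows 0 and 4
s = df['max_speed'] - df['shield']   # difference for every row

n_selected = int(m.sum())            # number of rows the mask selects

# Right-hand side 1: one tuple per selected row.
rows = list(zip(['overspeed'] * n_selected, s[m]))

# Right-hand side 2: helper DataFrame that keeps only the selected index labels.
helper = pd.DataFrame({'stat': 'overspeed', 'delta': s})[m]

print(n_selected, len(rows), list(helper.index))
```

If the lengths did not match the number of selected rows, that would be the first thing I would check.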

The simplest approach is to assign each column separately:

df.loc[m, 'stat'] = 'overspeed'
df.loc[m, 'delta'] = df['max_speed'] - df['shield']

print(df)
   max_speed  shield       stat  delta
0          5       2  overspeed    3.0
1          3       5        NaN    NaN
2          5       5        NaN    NaN
3          8       9        NaN    NaN
4         90      55  overspeed   35.0
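
For reuse, the separate assignments can be wrapped in a small function. This is only a sketch: flag_overspeed is a hypothetical name, not something from pandas, and it assumes the frame has max_speed and shield columns as in the question. It works on a copy so the original frame is not modified.

```python
import pandas as pd


def flag_overspeed(frame):
    """Hypothetical helper: add 'stat' and 'delta' where max_speed > shield."""
    result = frame.copy()  # leave the caller's frame untouched
    mask = result['max_speed'] > result['shield']
    # Scalar assignment: rows outside the mask stay NaN in the new column.
    result.loc[mask, 'stat'] = 'overspeed'
    # Series assignment: values are aligned on the index, so only masked rows are set.
    result.loc[mask, 'delta'] = result['max_speed'] - result['shield']
    return result


df = pd.DataFrame(
    [[5, 2], [3, 5], [5, 5], [8, 9], [90, 55]],
    columns=['max_speed', 'shield'],
)
out = flag_overspeed(df)
print(out)
```

Each .loc call targets a single column label here, so no list of new column names is passed to .loc.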

Upvotes: 1
