M D
M D

Reputation: 77

Pandas lambda function raised an indexing error

I have a data frame df and would like to reassign value from columns b to the last columns. The logic is as follows: if "b" column value is greater or equal to the previous row of "a" column value, reassign "b" value as "green", otherwise "red". My code raise an indexing error: Too many indexers. Have no idea what's wrong with my code. Any help would be appreciated.

value = [[10, 95, 10, 32],[22, 12, 3, 15],[28, 25, 5, 29],[30, 11, 66, 16]]
df = pd.DataFrame(value, columns=['a', 'b', 'c', 'd'])

for j in range(2, len(df.columns)):
    df.iloc[:,j] = df.apply(lambda x: "green" if x.iloc[:,j] >= (x["b"].shift(periods = 1)) else "red", axis = 1)

The expected result is:

 a     b     c     d

10   nan   nan   nan
22  green  red   green
28  green  red   green
30  red   green  red

Upvotes: 0

Views: 419

Answers (1)

Henry Ecker
Henry Ecker

Reputation: 35646

Let's try with np.where and compare where the columns are Series.ge than the shift of a:

import numpy as np

df.iloc[:, 1:] = np.where(df.iloc[:, 1:].ge(df['a'].shift(), axis=0),
                          'green',
                          'red')
df.iloc[0, 1:] = np.nan

df:

    a      b      c      d
0  10    NaN    NaN    NaN
1  22  green    red  green
2  28  green    red  green
3  30    red  green    red

The second assignment to put the nans back in the first row is necessary as nan >= value is False so the first row will end up all red.


It may be beneficial to add the NaN values back based on where the the shifted series is NaN (this allows for different types of shifting, but will not change the output from the above)

s = df['a'].shift()
df.iloc[:, 1:] = np.where(df.iloc[:, 1:].ge(s, axis=0), 'green', 'red')
df.iloc[s.isna(), 1:] = np.nan

A (slower) option with apply + map:

s = df['a'].shift()
df.iloc[:, 1:] = df.iloc[:, 1:].apply(
    lambda x: x.ge(s).map({True: 'green', False: 'red'})
)
df.iloc[s.isna(), 1:] = np.nan

Or apply + np.where:

s = df['a'].shift()
df.iloc[:, 1:] = df.iloc[:, 1:].apply(
    lambda x: np.where(x.ge(s), 'green', 'red')
)
df.iloc[s.isna(), 1:] = np.nan

Upvotes: 1

Related Questions