Alessandro Ceccarelli

Reputation: 1945

Numpy Where with more than 2 conditions

Good Morning,

I have a dataframe with two columns of integers and a Series (diff) computed as:

diff = (df["col_1"] - df["col_2"]) / (df["col_2"])

I would like to create a column of the dataframe whose values are:

0 if 0 <= diff <= 0.35
1 if diff > 0.35
2 if -0.35 <= diff < 0
3 if diff < -0.35

I tried with:

df["Class"] = np.where( (diff >= 0) &  (diff <= 0.35), 0, 
np.where( (diff > 0.35), 1, 
np.where( (diff  < 0) & (diff >=  - 0.35) ), 2, 
np.where( ((diff <  - 0.35), 3) ))) 

But it reports the following error:

SystemError: <built-in function where> returned a result with an error set          

How can I fix it?

Upvotes: 3

Views: 441

Answers (2)

katzenjammer

Reputation: 205

One can also simply use numpy.searchsorted:

diff_classes = [-0.35,0,0.35]
def getClass(x):
    return len(diff_classes)-np.searchsorted(diff_classes,x)

df["class"]=diff.apply(getClass)

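searchsorted gives you the position at which x would be inserted into diff_classes to keep it sorted (0 to 3), which you then subtract from 3 to get the class. Note that this numbers the bins from the top down, so values above 0.35 get 0 and values between 0 and 0.35 get 1, which is the reverse of the question's labels for those two bins.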

edit: A little bit less readable, but it also works in one line:

df["class"] = diff.apply(lambda x: 3-np.searchsorted([-0.35,0,0.35],x))
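For larger frames, the same idea can probably be applied to the whole Series at once instead of going through apply. The sketch below is only an illustration: the sample values and the names edges, labels, bins and classes are made up here, and the lookup array labels is an addition that lets each bin carry whatever class number you want.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values standing in for the question's `diff` Series.
diff = pd.Series([-0.5, -0.2, 0.1, 0.2, 0.5])

edges = np.array([-0.35, 0.0, 0.35])   # bin boundaries, must be sorted
labels = np.array([3, 2, 0, 1])        # class for each bin, lowest bin first

# searchsorted handles every value in one call and returns the bin index:
# 0 for values below -0.35, up to 3 for values above 0.35.
bins = np.searchsorted(edges, diff.to_numpy())

# Translate bin indices into class labels, keeping the original index.
classes = pd.Series(labels[bins], index=diff.index)
print(classes.tolist())  # [3, 2, 0, 0, 1]
```

One caveat: with the default side="left", a value exactly equal to a boundary lands in the lower bin, so a diff of exactly 0 or -0.35 would be classified differently from the inequalities in the question. If those edge cases matter, explicit conditions are safer.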

Upvotes: 1

jpp

Reputation: 164823

You can use numpy.select to specify conditions and values separately.

s = (df['col_1'] / df['col_2']) - 1

conditions = [s.between(0, 0.35), s > 0.35, s.between(-0.35, 0), s < -0.35]
values = [0, 1, 2, 3]

df['Class'] = np.select(conditions, values, np.nan)
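As a runnable check, here is a small sketch with made-up sample data (only the column names come from the question). It also includes a nested np.where version for comparison. The error in the question appears to come from parentheses that close two of the np.where calls too early, so those calls never receive all three arguments (condition, value if true, value if false). With the parentheses moved, both approaches should agree.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"col_1": [50, 80, 100, 120, 150],
                   "col_2": [100, 100, 100, 100, 100]})
diff = (df["col_1"] - df["col_2"]) / df["col_2"]   # -0.5, -0.2, 0.0, 0.2, 0.5

# np.select returns, per row, the value of the first condition that is True.
conditions = [diff.between(0, 0.35), diff > 0.35,
              diff.between(-0.35, 0), diff < -0.35]
df["Class"] = np.select(conditions, [0, 1, 2, 3], default=np.nan)

# Nested np.where: every call gets (condition, value_if_true, value_if_false).
nested = np.where((diff >= 0) & (diff <= 0.35), 0,
         np.where(diff > 0.35, 1,
         np.where((diff < 0) & (diff >= -0.35), 2,
         np.where(diff < -0.35, 3, np.nan))))

print(df["Class"].tolist())   # [3.0, 2.0, 0.0, 0.0, 1.0]
```

The result is a float column because np.nan is the fallback value. Rows only fall through to it when diff itself is NaN, since the four conditions cover every real number.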

Upvotes: 4
