Al_Iskander

Reputation: 1001

np.where() or another boolean way for new pandas dataframe column

I would like to create a column in a dataframe with stock prices that is conditional on the boolean values from another calculation.

              close    high   low
    Index
      0        10       11     10
      1        11       12     10
      2        10       11      9

First I want to define some condition(s) that could logically look like this, although the real ones are lengthy:

 condition1: if df.close > df.close.shift() return True

In reality I want to define many more conditions, all of which return True or False. I would then use them in np.where():

 df['NewColumn'] = np.where(condition1() == True, 'A', 'B')

I tried to define the condition as a function but could not set it up correctly. I would like to avoid writing the condition directly inside np.where(), because it would become too complex with several nested conditions.

So, how can I accomplish my task most efficiently?

Edit: a function could look like this (but it does not work in the np.where() above):

def condition1(): 
    if df.Close > df.Close.shift(1):
        return True
    else:
        return False

Upvotes: 0

Views: 11134

Answers (1)

MaxU - stand with Ukraine

Reputation: 210882

UPDATE: IMO you don't need "conditions" as functions:

In [89]: cond1 = df.Close > df.Close.shift(1)

In [90]: cond2 = df.High < 12

In [91]: df['New_Close'] = np.where(cond1, 'A', 'B')

In [92]: df
Out[92]:
   Close  High  Low New_Close
0     10    11   10         B
1     11    12   10         A
2     10    11    9         B

In [93]: df['New_High'] = np.where(cond2, '<12', '>=12')

In [94]: df
Out[94]:
   Close  High  Low New_Close New_High
0     10    11   10         B      <12
1     11    12   10         A     >=12
2     10    11    9         B      <12
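
For the many-conditions case from the question, the boolean Series can be combined with &, | and ~ before they go into np.where(), and np.select() is one option when more than two labels are needed. A minimal sketch, assuming the sample data above; the columns Signal and Label, the label strings and the name cond_combined are made up for illustration:

```python
import numpy as np
import pandas as pd

# Sample data from the question (column names as used in this answer)
df = pd.DataFrame({"Close": [10, 11, 10], "High": [11, 12, 11], "Low": [10, 10, 9]})

cond1 = df.Close > df.Close.shift(1)   # close rose versus the previous row
cond2 = df.High < 12                   # high stayed below 12

# Combine with & (and), | (or), ~ (not); each comparison already lives in its own variable
cond_combined = cond1 & ~cond2

# Hypothetical column: 'A' where the combined condition holds, else 'B'
df["Signal"] = np.where(cond_combined, "A", "B")

# Hypothetical column with more than two labels; the first matching condition wins
df["Label"] = np.select([cond1, cond2], ["rising", "high<12"], default="other")

print(df)
```

Keeping each condition in its own named variable keeps the np.where() call short even when the individual conditions are long.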

OLD answer:

I don't really see any benefit to this approach, but you can do it this way:

def condition1(): 
    return (df.Close > df.Close.shift(1))

def condition2(): 
    return (df.High < 12)


In [72]: df['New_Close'] = np.where(condition1(), 'A', 'B')

In [73]: df
Out[73]:
   Close  High  Low New_Close
0     10    11   10         B
1     11    12   10         A
2     10    11    9         B

In [74]: df['New_High'] = np.where(condition2(), '<12', '>=12')

In [75]: df
Out[75]:
   Close  High  Low New_Close New_High
0     10    11   10         B      <12
1     11    12   10         A     >=12
2     10    11    9         B      <12
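
As for why the if/else version from the question does not work: the comparison produces a whole boolean Series, and `if` needs a single True or False, so pandas raises a ValueError. Returning the Series itself, as above, avoids this. A small sketch that reproduces the error on the sample data (the name condition_scalar is hypothetical):

```python
import pandas as pd

# Sample data from the question (column names as used in this answer)
df = pd.DataFrame({"Close": [10, 11, 10], "High": [11, 12, 11], "Low": [10, 10, 9]})

def condition_scalar():
    # `if` asks for one truth value, but the comparison yields one value per row
    if df.Close > df.Close.shift(1):
        return True
    else:
        return False

try:
    condition_scalar()
    error_message = None
except ValueError as exc:
    # pandas refuses to reduce a multi-row Series to a single bool
    error_message = str(exc)

print(error_message)
```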

PS: IMO it would be easier and nicer to do it directly, as @PaulH said in his comment.

Upvotes: 2
