Garrad2810

Reputation: 113

In Pandas dataframe, how to append a new column of True / False based on each row's value?

I'm trying to create a dataframe of stock prices, and append a True/False column for each row based on certain conditions.

ind = [0,1,2,3,4,5,6,7,8]
close = [10,20,30,40,30,20,30,40,50]
open = [11,21,31,41,31,21,31,41,51]
upper = [11,21,31,41,31,21,31,41,51]
mid = [11,21,31,41,31,21,31,41,51]
cond1 = [True,True,True,False,False,True,False,True,True]
cond2 = [True,True,False,False,False,False,False,False,False]
cond3 = [True,True,False,False,False,False,False,False,False]
cond4 = [True,True,False,False,False,False,False,False,False]
cond5 = [True,True,False,False,False,False,False,False,False]

def check_conds(df, latest_price):
    '''1st set of INT for early breakout of Bollinger upper'''
    df.loc[:, ('cond1')] = df.close.shift(1) > df.upper.shift(1)
    df.loc[:, ('cond2')] = df.open.shift(1) < df.mid.shift(1).rolling(6).min()
    df.loc[:, ('cond3')] = df.close.shift(1).rolling(7).min() <= 21
    df.loc[:, ('cond4')] = df.upper.shift(1) < df.upper.shift(2)
    df.loc[:, ('cond5')] = df.mid.tail(3).max() < 30
    df.loc[:, ('Overall')] = all([df.cond1,df.cond2,df.cond3,df.cond4,df.cond5])    
    return df

The original 9 rows by 4 columns dataframe contains only the close / open / upper / mid columns.
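
A minimal sketch of how a frame like this could be built from the lists above. The variable name open_ and the default integer index are assumptions for illustration, not part of the original code.

```python
import pandas as pd

# Price lists copied from the question (nine values each).
close = [10, 20, 30, 40, 30, 20, 30, 40, 50]
# Trailing underscore avoids shadowing Python's built-in open().
open_ = [11, 21, 31, 41, 31, 21, 31, 41, 51]
upper = [11, 21, 31, 41, 31, 21, 31, 41, 51]
mid = [11, 21, 31, 41, 31, 21, 31, 41, 51]

# One column per list; pandas assigns the default 0..8 integer index.
df = pd.DataFrame({"close": close, "open": open_, "upper": upper, "mid": mid})

print(df.shape)  # (9, 4)
```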

The check_conds function returns the df nicely with the new cond1-5 columns appended, each holding True / False for every row, resulting in a dataframe with 9 rows by 9 columns.

However, when I tried to add another line that gives an 'Overall' True / False based on cond1-5 for each row, I received "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

df.loc[:, ('Overall')] = all([df.cond1,df.cond2,df.cond3,df.cond4,df.cond5])

So I tried pulling out each of cond1-5, and those are indeed Series of True / False. How do I make that last line in the function check each row's cond1-5 and return True only if all of cond1-5 are True for that row?

I just can't wrap my head around why the cond1-5 lines in the function work fine, comparing the values within each row, while the last line above (written in a similar style) seems to act on an entire Series at once.

Please advise!

Upvotes: 1

Views: 694

Answers (1)

gofvonx

Reputation: 1439

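Python's built-in all() calls bool() on each Series as a whole, and that is what raises the error. The error message points you to .all() instead, which here means pd.DataFrame.all. To check per row that all conditions are True, you have to specify the argument axis=1: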

df.loc[:, df.columns.str.startswith('cond')].all(axis=1)

Note that df.columns.str.startswith('cond') is just a lazy way of selecting all columns whose names start with 'cond'. Of course you can select the same columns explicitly with df[['cond1', 'cond2', 'cond3', 'cond4', 'cond5']] and call .all(axis=1) on that.
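
Here is a small self-contained sketch of the whole thing, using the condition values listed in the question. It first reproduces the error with the built-in all(), then builds the 'Overall' column with DataFrame.all(axis=1). The names cond_cols, error_message and overall_explicit are only illustrative.

```python
import pandas as pd

# Condition columns copied from the question; "close" is included to show
# that columns not starting with "cond" are left out of the check.
df = pd.DataFrame({
    "close": [10, 20, 30, 40, 30, 20, 30, 40, 50],
    "cond1": [True, True, True, False, False, True, False, True, True],
    "cond2": [True, True, False, False, False, False, False, False, False],
    "cond3": [True, True, False, False, False, False, False, False, False],
    "cond4": [True, True, False, False, False, False, False, False, False],
    "cond5": [True, True, False, False, False, False, False, False, False],
})

# The built-in all() needs one bool per list element, so it calls bool()
# on a whole Series, which pandas refuses to do.
try:
    all([df.cond1, df.cond2, df.cond3, df.cond4, df.cond5])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Row-wise check: True only where every cond* column is True in that row.
cond_cols = df.columns[df.columns.str.startswith("cond")]
df["Overall"] = df[cond_cols].all(axis=1)

# Equivalent explicit form using the element-wise & operator.
overall_explicit = df.cond1 & df.cond2 & df.cond3 & df.cond4 & df.cond5

print(df["Overall"].tolist())
# [True, True, False, False, False, False, False, False, False]
```

The & form is fine for a handful of columns; the .all(axis=1) form scales better when the number of condition columns changes.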

Upvotes: 1
