Song Wu

Reputation: 31

Comparing values in columns and returning boolean values

Hi, I have four columns named ma5, ma10, ma20, and ma60. For each row, I would like to check whether the values in these columns satisfy ma60 > ma20 > ma10 > ma5, returning 1 if true and 0 if false. So I tried the following:

(eachstockdf['ma5']>eachstockdf['ma10'] and 
eachstockdf['ma10']>eachstockdf['ma20'] and 
eachstockdf['ma20']>eachstockdf['ma60']) *1

I want it to return a Series of 0s and 1s indicating whether the condition is met, but it gives the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I know it works when I compare only two columns. For example:

((eachstockdf['ma5']>eachstockdf['ma10'])) *1

But how can I do this when comparing four columns? Thanks in advance!

Upvotes: 2

Views: 835

Answers (2)

Alexander

Reputation: 109528

If your DataFrame consists only of the four columns ma5, ma10, ma20, and ma60, in that order (or any other sequence of consecutive moving averages), you can check whether each value is greater than (gt) the value in the column to its left, which you get by shifting the frame along axis=1. The first column of that comparison is always irrelevant because there is nothing to its left, so exclude it via iloc[:, 1:]. Then check that all remaining values in each row are True, which means the moving averages in that row are strictly increasing from left to right (ma5 < ma10 < ma20 < ma60).

You can convert the boolean values to indicators of 1/0 simply by adding zero to the result (.add(0)) which coerces the result to integers. Alternatively, you can use .astype(int).

Note that this method is much more general than using named columns for comparison, as it allows an arbitrary number of consecutive moving averages with arbitrary windows.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 4), columns=['ma5', 'ma10', 'ma20', 'ma60'])

df.gt(df.shift(axis=1), axis=1).iloc[:, 1:].all(axis=1).add(0)
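
Because random data makes the result hard to check by eye, here is a minimal sketch of the same approach on fixed values. The sample numbers and the names greater_than_left and increasing are made up for illustration. It assumes the columns are already ordered from the shortest to the longest window.

```python
import pandas as pd

# Hypothetical sample data; columns run from shortest to longest window.
df = pd.DataFrame(
    {
        "ma5": [1.0, 4.0, 1.0],
        "ma10": [2.0, 3.0, 2.0],
        "ma20": [3.0, 2.0, 2.0],
        "ma60": [4.0, 1.0, 5.0],
    }
)

# Compare each column with the one to its left.
# The first column is compared against NaN, so it is always False.
greater_than_left = df.gt(df.shift(axis=1), axis=1)

# Drop the first column, require every remaining comparison to hold,
# then cast the booleans to 0/1.
increasing = greater_than_left.iloc[:, 1:].all(axis=1).astype(int)

print(increasing.tolist())
```

Only the first row is strictly increasing from left to right. The third row fails because ma10 and ma20 are equal. To test the opposite ordering (ma5 > ma10 > ma20 > ma60), swapping gt for lt should work the same way.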

Upvotes: 1

cs95

Reputation: 402303

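You need to use the bitwise & operator, which pandas overloads to combine boolean Series element by element. Python's and instead tries to reduce each Series to a single truth value, which is what raises the ValueError.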

Also, each condition must be wrapped in its own set of parentheses, because & binds more tightly than comparison operators such as >.

( (eachstockdf['ma5']  > eachstockdf['ma10']) & 
  (eachstockdf['ma10'] > eachstockdf['ma20']) & 
  (eachstockdf['ma20'] > eachstockdf['ma60'])  ) * 1

As a note, you could also convert bool to int with

(....).astype(int)
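
As a rough end-to-end illustration, the sketch below runs the same comparison on a small hand-made frame. The sample values and the names a, b, c, and_failed, and signal are assumptions for the example, not part of the original question. Naming each comparison first is one way to sidestep the precedence issue, since no comparison then shares an expression with &.

```python
import pandas as pd

# Hypothetical sample data standing in for eachstockdf.
eachstockdf = pd.DataFrame(
    {
        "ma5": [4.0, 1.0, 4.0],
        "ma10": [3.0, 2.0, 3.0],
        "ma20": [2.0, 3.0, 3.0],
        "ma60": [1.0, 4.0, 0.0],
    }
)

# Each comparison yields a boolean Series.
a = eachstockdf["ma5"] > eachstockdf["ma10"]
b = eachstockdf["ma10"] > eachstockdf["ma20"]
c = eachstockdf["ma20"] > eachstockdf["ma60"]

# `and` asks for the truth value of a whole Series, which pandas refuses.
try:
    a and b
    and_failed = False
except ValueError:
    and_failed = True

# `&` combines the Series element by element; astype(int) gives 0/1.
signal = (a & b & c).astype(int)

print(and_failed, signal.tolist())
```

Only the first row satisfies all three comparisons, so it is the only row flagged with 1.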

Upvotes: 1
