gabi
gabi

Reputation: 1113

Pandas DataFrame, iterate columns and check condition

I have DataFrame of product prices, Which have MultiIndex of the product and date, and the columns is sales on several stores.

For Example:

             d1        d2        d3        d4        d5        d6 
Date 
2000-01-03 -0.084321 -0.070914 -0.151308 -0.111111 -0.126745 -0.171419 
2000-01-04  0.014642 -0.073155 -0.029257 -0.046330 -0.095118 -0.148168 
2000-01-05 -0.086530 -0.043265 -0.060092 -0.108177 -0.160461 -0.067307 
2000-01-06  0.057364  0.028943 -0.023697 -0.080934  0.021044  0.057237 
2000-01-07 -0.017588 -0.017847 -0.002495 -0.025129  0.009427  0.009427

I want to see the difference of > 5%, or < -5%, but I need it to be sequential, meaning - first check d1, if not match, check d2...

The final result should be new Series with the numpy.sign result of the field (so -1 for < -5%, and +1 for >5%)

            Signal    
Date 
2000-01-03  -1
2000-01-04  -1
2000-01-05  -1
2000-01-06   1
2000-01-07   0

I am trying to do it with apply function of DataFrame, but I cannot understand how to do it sequentially on the columns.

Does anyone have an idea how to solve this?
Thanks.

Upvotes: 0

Views: 65

Answers (2)

Anton vBR
Anton vBR

Reputation: 18906

Reading the docs I think np.sign is used for a breakpoint and not a range as you want in this case.

df.apply accepts a function (lambda or def) and you should also send the axis param (0 for col, 1 for row):

import pandas as pd

df = pd.DataFrame({'d1': {'2000-01-03': -0.08432100000000001,
  '2000-01-04': 0.014641999999999999,
  '2000-01-05': -0.08653,
  '2000-01-06': 0.047363999999999996,
  '2000-01-07': -0.017588},
 'd2': {'2000-01-03': -0.070914,
  '2000-01-04': -0.073155,
  '2000-01-05': -0.043265,
  '2000-01-06': 0.028943,
  '2000-01-07': -0.017847},
 'd3': {'2000-01-03': -0.151308,
  '2000-01-04': -0.029257,
  '2000-01-05': -0.06009199999999999,
  '2000-01-06': -0.023697,
  '2000-01-07': -0.002495},
 'd4': {'2000-01-03': -0.11111099999999999,
  '2000-01-04': -0.04633,
  '2000-01-05': -0.108177,
  '2000-01-06': -0.080934,
  '2000-01-07': -0.025129},
 'd5': {'2000-01-03': -0.126745,
  '2000-01-04': -0.095118,
  '2000-01-05': -0.160461,
  '2000-01-06': 0.021044,
  '2000-01-07': 0.009427},
 'd6': {'2000-01-03': -0.17141900000000002,
  '2000-01-04': -0.148168,
  '2000-01-05': -0.067307,
  '2000-01-06': 0.057237,
  '2000-01-07': 0.009427}})

Code:

def ret_Signal(x):
    for i in x:
        if i > 0.05:
            return 1
        elif i < -0.05:
            return -1
    return 0

df.apply(ret_Signal, axis = 1)

Which returns:

2000-01-03   -1
2000-01-04   -1
2000-01-05   -1
2000-01-06   -1
2000-01-07    0

Upvotes: 1

Madaray
Madaray

Reputation: 105

That not the prettiest way but you can call the apply function with a lambda with multiple if statment, like this :

df['Signal'] = df.apply(lambda x: 1 if x.d1 > 0.1 else -1 if x.d2 < -0.1 else 0, axis=1)

Upvotes: 0

Related Questions