Kiann

Reputation: 571

Resampling a pandas DataFrame on every n-th row (or integer multiple)

I wish to re-sample some price-data stream, so that I calculate the product of two columns only for every n-th row.

For example, in my data below, I wish to calculate (in a new column) the product of signal and PrxDiff, but only if the index is a multiple of 6 (or some other integer).

ind  signal oldnal  time                price   PrxDiff     cnt
0   -1     4       2018-08-14 08:00:06  2.6575  7.525870    0
1   -1     3       2018-08-14 08:00:16  2.6575  7.525870    1
2   -2     2       2018-08-14 08:00:26  2.6585  3.761520    2
3   -1     1       2018-08-14 08:00:36  2.6585  3.761520    3
4   -4     1       2018-08-14 08:00:46  2.6585  3.761520    4
5   1      0       2018-08-14 08:00:56  2.6585  3.761520    5
6   -3     3       2018-08-14 08:01:06  2.6595  0.000000    0
7   0      2       2018-08-14 08:01:16  2.6595  0.000000    1
8   -3     3       2018-08-14 08:01:26  2.6595  0.000000    2

What I have tried is to generate a 'remainder' value (cnt), and then loop over every row with an if statement to check whether cnt == 0.

dataT['cnt'] = dataT.index % 6
for row in dataT.index:
    if dataT.cnt[row] == 0:
        dataT.cnt[row] = dataT.PrxDiff[row] * dataT.signal[row]
    else:
        dataT.cnt[row] == 0

dataT

But there are two problems: the column cnt becomes an integer (and the original values don't seem to be set to zero), and the calculations seem to take forever (for some reason).

   ind  signal  oldnal  time                price   PrxDiff     cnt
    0   -1      4       2018-08-14 08:00:06 2.6575  7.525870    -7.0
    1   -1      3       2018-08-14 08:00:16 2.6575  7.525870    1.0
    2   -2      2       2018-08-14 08:00:26 2.6585  3.761520    2.0
    3   -1      1       2018-08-14 08:00:36 2.6585  3.761520    3.0
    4   -4      1       2018-08-14 08:00:46 2.6585  3.761520    4.0
    5   1       0       2018-08-14 08:00:56 2.6585  3.761520    5.0
    6   -3      3       2018-08-14 08:01:06 2.6595  0.000000    0.0
    7   0       2       2018-08-14 08:01:16 2.6595  0.000000    1.0
    8   -3      3       2018-08-14 08:01:26 2.6595  0.000000    2.0
    9   -3      2       2018-08-14 08:01:36 2.6595  3.760105    3.0
    10  -5      1       2018-08-14 08:01:46 2.6595  3.760105    4.0
    11  -2      0       2018-08-14 08:01:56 2.6595  3.760105    5.0
    12  -3      1       2018-08-14 08:02:06 2.6595  3.760105    -11.0

Upvotes: 2

Views: 39

Answers (1)

BENY

Reputation: 323236

Based on your logic, you can use np.where:

np.where((df.ind%6)==0,0,df.PrxDiff*df.signal)
Out[268]: 
array([  0.      ,  -7.52587 ,  -7.52304 ,  -3.76152 , -15.04608 ,
         3.76152 ,   0.      ,   0.      ,  -0.      , -11.280315,
       -18.800525,  -7.52021 ,   0.      ])

#df['cnt'] = np.where((df.ind%6)==0,0,df.PrxDiff*df.signal)
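Note that the np.where call above, as written, returns 0 on the rows where ind is a multiple of 6 and the product on all other rows. If the goal is the one described in the question (product only on every 6th row, zero elsewhere), the two value arguments would be swapped. A minimal sketch, assuming a frame rebuilt by hand from the 13 rows shown in the question and a default integer index; the names n and on_nth_row are illustrative only:

```python
import numpy as np
import pandas as pd

# Hypothetical frame rebuilt from the rows printed in the question.
df = pd.DataFrame({
    "signal":  [-1, -1, -2, -1, -4, 1, -3, 0, -3, -3, -5, -2, -3],
    "PrxDiff": [7.525870, 7.525870, 3.761520, 3.761520, 3.761520, 3.761520,
                0.0, 0.0, 0.0, 3.760105, 3.760105, 3.760105, 3.760105],
})

n = 6  # keep the product on every n-th row (assumed step)

# Boolean mask: True where the index is a multiple of n.
on_nth_row = (df.index % n) == 0

# Product on the masked rows, 0.0 everywhere else; computed for the
# whole column at once, so no Python-level loop over rows is needed.
df["cnt"] = np.where(on_nth_row, df["PrxDiff"] * df["signal"], 0.0)
```

Because the whole column is assigned in one step from a float array, cnt ends up with a float dtype, so the products should not be truncated the way they appear to be in the loop version.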

Upvotes: 2
