Sachin Verma
Sachin Verma

Reputation: 3802

pandas series add with previous row on condition

I need to add a series with previous rows only if a condition matches in current cell. Here's the dataframe:

import pandas as pd
        data = {'col1': [1, 2, 1, 0, 0, 0, 0, 3, 2, 2, 0, 0]}
        df = pd.DataFrame(data, columns=['col1'])
        df['continuous'] = df.col1
        print(df)

I need to +1 a cell with previous sum if it's value > 0 else -1. So, result I'm expecting is;

   col1  continuous
0      1           1//+1 as its non-zero
1      2           2//+1 as its non-zero
2      1           3//+1 as its non-zero
3      0           2//-1 as its zero
4      0           1
5      0           0
6      0           0// not to go less than 0
7      3           1
8      2           2
9      2           3
10     0           2
11     0           1

Case 2 : where I want instead of >0 , I need <-0.1

data = {'col1': [-0.097112634,-0.092674324,-0.089176841,-0.087302284,-0.087351866,-0.089226185,-0.092242213,-0.096446987,-0.101620036,-0.105940337,-0.109484752,-0.113515648,-0.117848816,-0.121133266,-0.123824577,-0.126030136,-0.126630895,-0.126015218,-0.124235003,-0.122715224,-0.121746573,-0.120794916,-0.120291174,-0.120323152,-0.12053229,-0.121491186,-0.122625851,-0.123819704,-0.125751858,-0.127676591,-0.129339428,-0.132342431,-0.137119556,-0.142040092,-0.14837848,-0.15439201,-0.159282645,-0.161271982,-0.162377701,-0.162838307,-0.163204393,-0.164095634,-0.165496071,-0.167224488,-0.167057078,-0.165706164,-0.163301617,-0.161423938,-0.158669389,-0.156508912,-0.15508329,-0.15365104,-0.151958972,-0.150317528,-0.149234892,-0.148259354,-0.14737422,-0.145958527,-0.144633388,-0.143120273,-0.14145652,-0.139930163,-0.138774126,-0.136710524,-0.134692221,-0.132534879,-0.129921444,-0.127974949,-0.128294058,-0.129241763,-0.132263506,-0.137828981,-0.145549768,-0.154244588,-0.163125109,-0.171814857,-0.179911465,-0.186223859,-0.190653162,-0.194761064,-0.197988536,-0.200500606,-0.20260121,-0.204797089,-0.208281065,-0.211846904,-0.215312626,-0.218696339,-0.221489975,-0.221375209,-0.220996031,-0.218558429,-0.215936558,-0.213933531,-0.21242896,-0.209682125,-0.208196607,-0.206243585,-0.202190476,-0.19913106,-0.19703291,-0.194244664,-0.189609518,-0.186600526,-0.18160171,-0.175875689,-0.170767095,-0.167453329,-0.163516985,-0.161168703,-0.158197984,-0.156378046,-0.154794499,-0.153236804,-0.15187487,-0.151623385,-0.150628282,-0.149039072,-0.14826268,-0.147535739,-0.145557646,-0.142223729,-0.139343068,-0.135355686,-0.13047743,-0.125999173,-0.12218752,-0.117021996,-0.111542982,-0.106409901,-0.101904095,-0.097910825,-0.094683375,-0.092079967,-0.088953862,-0.086268097,-0.082907394,-0.080723466,-0.078117426,-0.075431993,-0.072079536,-0.068962411,-0.064831759,-0.061257701,-0.05830671,-0.053889968,-0.048972414,-0.044763431,-0.042162829,-0.039328369,-0.038968862,-0.040450835,-0.041974942,-0.042161609,-0.04280523,-0.042702428,-0.042593856,-0.043166561,-0.043691795,-0.044093492,-0.043965231,-0.04263305,-0.040836102,-0.039605133,-0.037204273,-0.034368645,-0.032293737,-0.029037983,-0.025509509,-0.022704668,-0.021346266,-0.019881524,-0.018675734,-0.017509566,-0.017148129,-0.016671088,-0.016015011,-0.016241862,-0.016416445,-0.016548878,-0.016475455,-0.016405742,-0.015567737,-0.014190101,-0.012373151,-0.010370329,-0.008131459,-0.006729419,-0.005667607,-0.004883919,-0.004841328,-0.005403019,-0.005343759,-0.005377974,-0.00548823,-0.004889709,-0.003884973,-0.003149113,-0.002975268,-0.00283163,-0.00322658,-0.003546589,-0.004233582,-0.004448617,-0.004706967,-0.007400356,-0.010104064,-0.01230257,-0.014430498,-0.016499501,-0.015348355,-0.013974229,-0.012845464,-0.012688459,-0.012552231,-0.013719074,-0.014404172,-0.014611632,-0.013401283,-0.011807386,-0.007417753,-0.003321279,0.000363954,0.004908491,0.010151584,0.013223831,0.016746553,0.02106351,0.024571507,0.027588073,0.031313637,0.034419301,0.037016545,0.038172954,0.038237253,0.038094387,0.037783779,0.036482515,0.036080763,0.035476154,0.034107081,0.03237083,0.030934259,0.029317076,0.028236195,0.027850758,0.024612491,0.01964433,0.015153308,0.009684456,0.003336172]}
        df = pd.DataFrame(data, columns=['col1'])
        lim = float(-0.1)
        s = df['col1'].lt(lim)
        out = s.where(s, -1).cumsum()
        df['sol'] = out - out.where((out < 0) & (~s)).ffill().fillna(0)
        print(df)

Upvotes: 1

Views: 177

Answers (2)

robertwest
robertwest

Reputation: 942

You can do this using cumsum function on booleans:

Give me a +1 whenever col1 is not zero:

(df.col1 != 0 ).cumsum()

Give me a -1 whenever col1 is zero:

- (df.col1 == 0 ).cumsum()

Then just add them together!

df['continuous'] = (df.col1 != 0 ).cumsum() - (df.col1 == 0 ).cumsum()

However this does not resolve the dropping below zero criteria you mentioned

Upvotes: 1

Quang Hoang
Quang Hoang

Reputation: 150785

The key problem here, to me, is to control the out not to go below zero. With that in mind, we can mask the output where it's negative and adjust accordingly:

# a little longer data for corner case
df = pd.DataFrame({'col1': [1, 2, 1, 0, 0, 0, 0, 3, 2, 2, 0, 0,0,0,0,2,3,4]})


s = df.col1.gt(0)
out = s.where(s,-1).cumsum()
df['continuous'] = out - out.where((out<0)&(~s)).ffill().fillna(0)

Output:

    col1 continuous
0      1          1
1      2          2
2      1          3
3      0          2
4      0          1
5      0          0
6      0          0
7      3          1
8      2          2
9      2          3
10     0          2
11     0          1
12     0          0
13     0          0
14     0          0
15     2          1
16     3          2
17     4          3

Upvotes: 1

Related Questions