StefaM
StefaM

Reputation: 53

Sum negative row values with previous rows pandas

I'm having trouble finding a good way to find all negative entries in a column and move them up the column, summing them up with the existing entry (i.e. subtracting the negative entry from the present entry) until all values are positive.

It is important that there are no negative values for the final dataframe & that all previously negative entries = 0. Also, the table is repeating, which means that I need to aggregate the results both based on the ID as well as on the entries (only do summations of entries of the same ID).

Based on an already presented table here:

Present:

ID Date Entries
1 2013 100
1 2014 0
1 2015 60
1 2016 -30
1 2017 0
1 2018 50
1 2019 0
1 2020 -20
2 2013 100
2 2014 0
2 2015 60
2 2016 -30
2 2017 0
2 2018 50
2 2019 0
2 2020 -20

Desired:

ID Date Entries
1 2013 100
1 2014 0
1 2015 30
1 2016 0
1 2017 0
1 2018 30
1 2019 0
1 2020 0
2 2013 100
2 2014 0
2 2015 30
2 2016 0
2 2017 0
2 2018 30
2 2019 0
2 2020 0

Upvotes: 5

Views: 721

Answers (2)

Rob Raymond
Rob Raymond

Reputation: 31146

A straight forward recursive function on the array of values

df = pd.read_csv(io.StringIO("""ID  Date    Entries
1   2013    100
1   2014    0
1   2015    60
1   2016    -30
1   2017    0
1   2018    50
1   2019    0
1   2020    -20
2   2013    100
2   2014    0
2   2015    60
2   2016    -30
2   2017    0
2   2018    50
2   2019    0
2   2020    -20"""), sep="\t")

def shiftminus(a):
    touch=False
    for i,n in enumerate(a):
        if n<0 and i>0:
            a[i-1] += a[i]
            a[i] = 0
            touch=True
    if touch:
        a = shiftminus(a)
    return a

df["Entries"] = shiftminus(df["Entries"].values)


output

 ID  Date  Entries
  1  2013      100
  1  2014        0
  1  2015       30
  1  2016        0
  1  2017        0
  1  2018       30
  1  2019        0
  1  2020        0
  2  2013      100
  2  2014        0
  2  2015       30
  2  2016        0
  2  2017        0
  2  2018       30
  2  2019        0
  2  2020        0

Upvotes: 0

anky
anky

Reputation: 75080

You can try reverse cumsum after creating a group, then mask:

s = df['Entries'].gt(0).cumsum()
u= df['Entries'][::-1].groupby(s).cumsum().mask(df['Entries'].le(0),0)
out = df.assign(New_Entries=u) # you can assign to the original column too.

print(out)
    ID  Date  Entries  New_Entries
0    1  2013      100          100
1    1  2014        0            0
2    1  2015       60           30
3    1  2016      -30            0
4    1  2017        0            0
5    1  2018       50           30
6    1  2019        0            0
7    1  2020      -20            0
8    2  2013      100          100
9    2  2014        0            0
10   2  2015       60           30
11   2  2016      -30            0
12   2  2017        0            0
13   2  2018       50           30
14   2  2019        0            0
15   2  2020      -20            0

Upvotes: 8

Related Questions