Reputation: 53
I'm having trouble finding a good way to find all negative entries in a column and move them up the column, summing them up with the existing entry (i.e. subtracting the negative entry from the present entry) until all values are positive.
It is important that there are no negative values for the final dataframe & that all previously negative entries = 0. Also, the table is repeating, which means that I need to aggregate the results both based on the ID as well as on the entries (only do summations of entries of the same ID).
Based on an already presented table here:
Present:
ID | Date | Entries |
---|---|---|
1 | 2013 | 100 |
1 | 2014 | 0 |
1 | 2015 | 60 |
1 | 2016 | -30 |
1 | 2017 | 0 |
1 | 2018 | 50 |
1 | 2019 | 0 |
1 | 2020 | -20 |
2 | 2013 | 100 |
2 | 2014 | 0 |
2 | 2015 | 60 |
2 | 2016 | -30 |
2 | 2017 | 0 |
2 | 2018 | 50 |
2 | 2019 | 0 |
2 | 2020 | -20 |
Desired:
ID | Date | Entries |
---|---|---|
1 | 2013 | 100 |
1 | 2014 | 0 |
1 | 2015 | 30 |
1 | 2016 | 0 |
1 | 2017 | 0 |
1 | 2018 | 30 |
1 | 2019 | 0 |
1 | 2020 | 0 |
2 | 2013 | 100 |
2 | 2014 | 0 |
2 | 2015 | 30 |
2 | 2016 | 0 |
2 | 2017 | 0 |
2 | 2018 | 30 |
2 | 2019 | 0 |
2 | 2020 | 0 |
Upvotes: 5
Views: 721
Reputation: 31146
A straight forward recursive function on the array of values
df = pd.read_csv(io.StringIO("""ID Date Entries
1 2013 100
1 2014 0
1 2015 60
1 2016 -30
1 2017 0
1 2018 50
1 2019 0
1 2020 -20
2 2013 100
2 2014 0
2 2015 60
2 2016 -30
2 2017 0
2 2018 50
2 2019 0
2 2020 -20"""), sep="\t")
def shiftminus(a):
touch=False
for i,n in enumerate(a):
if n<0 and i>0:
a[i-1] += a[i]
a[i] = 0
touch=True
if touch:
a = shiftminus(a)
return a
df["Entries"] = shiftminus(df["Entries"].values)
ID Date Entries
1 2013 100
1 2014 0
1 2015 30
1 2016 0
1 2017 0
1 2018 30
1 2019 0
1 2020 0
2 2013 100
2 2014 0
2 2015 30
2 2016 0
2 2017 0
2 2018 30
2 2019 0
2 2020 0
Upvotes: 0
Reputation: 75080
You can try reverse cumsum after creating a group, then mask:
s = df['Entries'].gt(0).cumsum()
u= df['Entries'][::-1].groupby(s).cumsum().mask(df['Entries'].le(0),0)
out = df.assign(New_Entries=u) # you can assign to the original column too.
print(out)
ID Date Entries New_Entries
0 1 2013 100 100
1 1 2014 0 0
2 1 2015 60 30
3 1 2016 -30 0
4 1 2017 0 0
5 1 2018 50 30
6 1 2019 0 0
7 1 2020 -20 0
8 2 2013 100 100
9 2 2014 0 0
10 2 2015 60 30
11 2 2016 -30 0
12 2 2017 0 0
13 2 2018 50 30
14 2 2019 0 0
15 2 2020 -20 0
Upvotes: 8