How to sum column in reverse order in pandas with groupby

Question

I currently need to replicate this dataset where I must groupby subjectID copy and count how many have a score of 1 in the future. I must count them in reverse basically however I'm not sure how to do that and groupby subject ID at the same time.

   SubjectID copy  Score  Number of All Future Hosp for O column
0        phchp003      1                                       4
1        phchp003      1                                       3
2        phchp003      1                                       2
3        phchp003      1                                       1
4        phchp003      1                                       0
5        phchp004      1                                       4
6        phchp004      1                                       3
7        phchp004      1                                       2
8        phchp004      1                                       1
9        phchp004      1                                       0
10       phchp006      0                                       3
11       phchp006      0                                       3
12       phchp006      0                                       3
13       phchp006      0                                       3
14       phchp006      1                                       2
15       phchp006      1                                       1
16       phchp006      1                                       0

I currently have

data['Sum']= data.groupby(['SubjectID copy'])['Score'].cumsum()

which gives me the values but summed in descending order, where I need mine to go from bottom up.

Henry Ecker · Accepted Answer

We can use loc to reverse before using groupby transform. We can then use shift and cumsum to only consider "future" values:

data['Sum'] = (
    data.loc[::-1]  # Reverse DataFrame
        .groupby(['SubjectID copy'])['Score']  # Groups
        .transform(lambda s: s.shift(fill_value=0).cumsum())  # handle transformation
)

data:

   SubjectID copy  Score  Sum
0        phchp003      1    4
1        phchp003      1    3
2        phchp003      1    2
3        phchp003      1    1
4        phchp003      1    0
5        phchp004      1    4
6        phchp004      1    3
7        phchp004      1    2
8        phchp004      1    1
9        phchp004      1    0
10       phchp006      0    3
11       phchp006      0    3
12       phchp006      0    3
13       phchp006      0    3
14       phchp006      1    2
15       phchp006      1    1
16       phchp006      1    0

How to sum column in reverse order in pandas with groupby

Answers (2)

Related Questions