Mason Sanchez
Mason Sanchez

Reputation: 13

Rolling sum for a window of 2 days

I am trying to compute a rolling 2 day using trans_date sum against the amount column that is grouped by ID within the table below using python.

<table><tbody><tr><th>ID</th><th>Trans_Date</th><th>Trans_Time</th><th>Amount</th><th> </th></tr><tr><td>1</td><td>03/23/2019</td><td>06:51:03</td><td>100</td><td> </td></tr><tr><td>1</td><td>03/24/2019</td><td>12:32:48</td><td>600</td><td> </td></tr><tr><td>1</td><td>03/24/2019</td><td>14:15:35</td><td>50</td><td> </td></tr><tr><td>1</td><td>06/05/2019</td><td>16:18:21</td><td>75</td><td> </td></tr><tr><td>2</td><td>02/01/2019</td><td>18:02:52</td><td>200</td><td> </td></tr><tr><td>2</td><td>02/02/2019</td><td>10:03:02</td><td>150</td><td> </td></tr><tr><td>2</td><td>02/03/2019</td><td>23:47:51</td><td>800</td><td> </td></tr><tr><td>3</td><td>01/18/2019</td><td>11:12:58</td><td>1000</td><td> </td></tr><tr><td>3</td><td>01/23/2019</td><td>22:12:41</td><td>15</td><td> </td></tr></tbody></table>

Ultimately, I am trying to achieve the result below using

<table><tbody><tr><th>ID</th><th>Trans_Date</th><th>Trans_Time</th><th>Amount</th><th>2d_Running_Total</th><th> </th></tr><tr><td>1</td><td>03/23/2019</td><td>06:51:03</td><td>100</td><td>100</td><td> </td></tr><tr><td>1</td><td>03/24/2019</td><td>12:32:48</td><td>600</td><td>700</td><td> </td></tr><tr><td>1</td><td>03/24/2019</td><td>14:15:35</td><td>250</td><td>950</td><td> </td></tr><tr><td>1</td><td>06/05/2019</td><td>16:18:21</td><td>75</td><td>75</td><td> </td></tr><tr><td>2</td><td>02/01/2019</td><td>18:02:52</td><td>200</td><td>200</td><td> </td></tr><tr><td>2</td><td>02/02/2019</td><td>10:03:02</td><td>150</td><td>350</td><td> </td></tr><tr><td>2</td><td>02/03/2019</td><td>23:47:51</td><td>800</td><td>950</td><td> </td></tr><tr><td>3</td><td>01/18/2019</td><td>11:12:58</td><td>1000</td><td>1000</td><td> </td></tr><tr><td>3</td><td>01/23/2019</td><td>22:12:41</td><td>15</td><td>15</td><td> </td></tr></tbody></table>

This hyperlink was very close to solving this, but the issue is for the records that have multiple transactions on the same day, it provides the same value for the same day. https://python-forum.io/Thread-Rolling-sum-for-a-window-of-2-days-Pandas

Upvotes: 1

Views: 752

Answers (1)

RubenB
RubenB

Reputation: 525

This should do it:

import pandas as pd

# create dummy data
df = pd.DataFrame(
    columns = ['ID', 'Trans_Date', 'Trans_Time', 'Amount'],
    data = [
        [1, '03/23/2019', '06:51:03', 100],
        [1, '03/24/2019', '12:32:48', 600],
        [1, '03/24/2019', '14:15:35', 250],
        [1, '06/05/2019', '16:18:21', 75],
        [2, '02/01/2019', '18:02:52', 200],
        [2, '02/02/2019', '10:03:02', 150],
        [2, '02/03/2019', '23:47:51', 800],
        [3, '01/18/2019', '11:12:58', 1000],
        [3, '01/23/2019', '22:12:41', 15]
    ]
)

df_out = pd.DataFrame(
    columns = ['ID', 'Trans_Date', 'Trans_Time', 'Amount', '2d_Running_Total'],
    data = [
        [1, '03/23/2019', '06:51:03', 100, 100],
        [1, '03/24/2019', '12:32:48', 600, 700],
        [1, '03/24/2019', '14:15:35', 250, 950],
        [1, '06/05/2019', '16:18:21', 75, 75],
        [2, '02/01/2019', '18:02:52', 200, 200],
        [2, '02/02/2019', '10:03:02', 150, 350],
        [2, '02/03/2019', '23:47:51', 800, 950],
        [3, '01/18/2019', '11:12:58', 1000, 1000]
    ]
)

# convert into datetime object and set as index
df['Trans_DateTime'] = pd.to_datetime(df['Trans_Date'] + ' ' + df['Trans_Time'])
df = df.set_index('Trans_DateTime')

# group by ID and apply rolling window to the amount column
df['2d_Running_Total'] = df.groupby('ID')['Amount'].rolling('2d').sum().values.astype(int)
df.reset_index(drop=True)

Upvotes: 1

Related Questions