Arjun Vasudevan
Arjun Vasudevan

Reputation: 872

Iterative calculation in power query

I've a data set:

Policy,     EXPIRY_DT,  TIMESTAMP, Investment
Prudential, 10/31/2019, 8/9/2019,  0
Prudential, 10/31/2019, 8/8/2019,  0
Prudential, 10/31/2019, 8/7/2019,  0
Prudential, 10/31/2019, 8/6/2019,  -20550
Prudential, 10/31/2019, 8/5/2019,  0
Prudential, 10/31/2019, 8/2/2019,  21300
Prudential, 10/31/2019, 8/1/2019,  30000
Prudential, 10/31/2019, 7/31/2019, 0
Prudential, 10/31/2019, 7/30/2019, 29250
Prudential, 10/31/2019, 7/29/2019, 0
Prudential, 10/31/2019, 7/26/2019, 33900

My need is:

for a particular "policy" for a particular "expiry date",
if today's "investment" + previous day's "cumulative investment" < 0,
then today's "cumulative investment" = 0,
else today's "cumulative investment" = today's "investment" + previous day's "cumulative investment"

The output should look like:

Policy,     EXPIRY_DT,  TIMESTAMP, Investment, Cumulative Investment
Prudential, 10/31/2019, 8/9/2019,  0,          93900
Prudential, 10/31/2019, 8/8/2019,  0,          93900
Prudential, 10/31/2019, 8/7/2019,  0,          93900
Prudential, 10/31/2019, 8/6/2019,  -20550,     93900
Prudential, 10/31/2019, 8/5/2019,  0,          114450
Prudential, 10/31/2019, 8/2/2019,  21300,      114450
Prudential, 10/31/2019, 8/1/2019,  30000,      93150
Prudential, 10/31/2019, 7/31/2019, 0,          63150
Prudential, 10/31/2019, 7/30/2019, 29250,      63150
Prudential, 10/31/2019, 7/29/2019, 0,          33900
Prudential, 10/31/2019, 7/26/2019, 33900,      33900

I'm able to achieve it in Excel worksheet, but is there a way to achieve inside Powerquery?

Upvotes: 1

Views: 721

Answers (2)

Alexis Olson
Alexis Olson

Reputation: 40204

In Power Query, if you aren't afraid of the M language, then you can do this efficiently in O(n). There are simpler ways to do this in M as O(n^2) if performance isn't a concern.


This solution is a good example of how to do this using the List.Accumulate function.

let
    StartingTable = <Table Source Goes Here>,
    SortAndBuffer = Table.Buffer(Table.Sort(StartingTable,{{"TIMESTAMP", Order.Ascending}})),
    TableType = Value.Type(Table.AddColumn(SortAndBuffer, "Cumulative Investment", each null, type number)),
    Cumulative = List.Skip(List.Accumulate(SortAndBuffer[Investment],{0},(cumulative,Investment) => cumulative & {List.Last(cumulative) + Investment})),
    AddCumulativeCol = Table.FromColumns(Table.ToColumns(SortAndBuffer)&{Cumulative},TableType)
in
    AddCumulativeCol

The critical part of this code is this:

Cumulative =
    List.Skip(
        List.Accumulate(
            SortAndBuffer[Investment],
            {0},
            (cumulative,Investment) => cumulative & {List.Last(cumulative) + Investment}
        )
    )

The List.Accumulate function starts with the single element list {0} and for each value in the Investment column list, it takes that (initially single-value) list and adds another value of whatever the last value was plus the current Investment value.

This generates the list

{0,33900,33900,63150,63150,93150,114450,114450,93900,93900,93900,93900}

and List.Skip chops off that initial zero.

Finally, this list gets added as a column to the starting table using Table.FromColumns.


In DAX, you can write a fairly simple calculated column as follows:

Cumulative Investment =
VAR CumulativeSum =
    CALCULATE (
        SUM ( 'Table'[Investment] ),
        ALLEXCEPT ( 'Table', 'Table'[Policy], 'Table'[EXPIRY_DT] ),
        'Table'[TIMESTAMP] <= EARLIER ( 'Table'[TIMESTAMP] )
    )
RETURN MAX(CumulativeSum,0)

Upvotes: 1

Jelle Hoekstra
Jelle Hoekstra

Reputation: 672

Although I don't see how your pseudo code would result in the output you specified I believe this might help you get started. Doing this in PowerQuery would be hard for not impossible to do efficiently for as far as I know. The most efficient way of accomplishing this would be to create a calculated column with the following code.

Cummulative Investment = 
    var _thisPeriod = 
        CALCULATE (
            SUM ( Investments[Investment] ) ;
            ALL ( Investments ) ; 
            Investments[Policy] = EARLIER( Investments[Policy] ) ; 
            Investments[EXPIRY_DT] = EARLIER( Investments[EXPIRY_DT] ) ;
            Investments[TIMESTAMP] = EARLIER( Investments[TIMESTAMP] ) 
        )
    var _prevPeriod = 
        CALCULATE (
            SUM ( Investments[Investment] ) ;
            ALL ( Investments ) ; 
            Investments[Policy] = EARLIER( Investments[Policy] ) ; 
            Investments[EXPIRY_DT] = EARLIER( Investments[EXPIRY_DT] ) ;
            Investments[TIMESTAMP] <= EARLIER( Investments[TIMESTAMP] ) - 1 
        )

RETURN
    IF ( _thisPeriod + _prevPeriod < 0 ; 0 ; _thisPeriod + _prevPeriod )

I don't know if this is precisely what you are looking for but it should be a good basis to explore yourself

Upvotes: 2

Related Questions