S. Melted

Reputation: 293

How can I interpolate missing values in a column in Power Query / Power BI?

There is a post out there that describes how to do it in a very specific example:

https://community.powerbi.com/t5/Community-Blog/Linear-Interpolation-with-Power-BI/ba-p/341202

But that code isn't very portable, because it refers to specific columns by name.

It also doesn't package the code as a function, so your query ends up littered with extra steps and variables.

Upvotes: 1

Views: 3069

Answers (1)

S. Melted

Reputation: 293

I have written a (relatively) generic function for interpolating values in Power Query (also useful for Power BI and M code in general).

It takes a table and two column names as inputs. It returns the table with the missing y values linearly interpolated from the nearest known (x, y) pairs, where "nearest" means nearest in the row order the data was passed in, not numerically closest. If you need numerical closeness, sort by x (and possibly buffer the result) before passing the table to this function.
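For reference, the arithmetic behind each gap is the usual two-point line fit. Here is a small worked case using made-up points (1, 10) and (4, 40), which are not from any real data set, filling in the missing y at x = 2:

```latex
\begin{aligned}
m &= \frac{y_2 - y_1}{x_2 - x_1} = \frac{40 - 10}{4 - 1} = 10 \\
b &= y_1 - m\,x_1 = 10 - 10 \cdot 1 = 0 \\
y &= m\,x + b = 10 \cdot 2 + 0 = 20
\end{aligned}
```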

(Input as table, xColumn as text, yColumn as text) =>
//Interpolates missing yColumn values based on nearest existing xColumn, yColumn pairs
let
    Buffer = Table.Buffer(Input),
    //index for joining calculations and preserving original order
    #"Added Main Index" = Table.AddIndexColumn(Buffer, "InterpolateMainIndex", 0, 1),
    #"Two Columns and Index" = Table.RemoveColumns(#"Added Main Index", List.Select(Table.ColumnNames(#"Added Main Index"), each _ <> xColumn and _ <> yColumn and _ <> "InterpolateMainIndex")),
    #"Remove Blanks" = Table.SelectRows(#"Two Columns and Index", each Record.Field(_, yColumn) <> null and Record.Field(_, yColumn) <> ""),
    //index for referring to next non-blank record
    #"Added Sub Index" = Table.AddIndexColumn(#"Remove Blanks", "InterpolateSubIndex", 0, 1),
    //m = (y2 - y1) / (x2 - x1)
    //the last non-blank row has no next point, so its m (and b) stay null and are filled down from the previous segment later
    m = Table.AddColumn(#"Added Sub Index",
                        "m",
                        each    let nextRow = #"Added Sub Index"{[InterpolateSubIndex]+1}? in
                                if nextRow = null then null else
                                (Number.From(Record.Field(_, yColumn))-Number.From(Record.Field(nextRow, yColumn))) /
                                (Number.From(Record.Field(_, xColumn))-Number.From(Record.Field(nextRow, xColumn))),
                        type number),
    //b = y - m * x
    b = Table.AddColumn(m, "b", each Number.From(Record.Field(_, yColumn)) - [#"m"] * Number.From(Record.Field(_, xColumn)), type number),
    //rename or remove columns to allow full join
    #"Renamed Columns" = Table.RenameColumns(b,{{"InterpolateMainIndex", "InterpolateMainIndexCopy"}}),
    xColumnmb = Table.RemoveColumns(#"Renamed Columns",{yColumn, xColumn, "InterpolateSubIndex"}),
    Join = Table.Join(#"Added Main Index", "InterpolateMainIndex", xColumnmb, "InterpolateMainIndexCopy", JoinKind.FullOuter),
    //enforce original sorting
    #"Sorted by Main Index" = Table.Sort(Join,{{"InterpolateMainIndex", Order.Ascending}}),
    #"Filled Down mb" = Table.FillDown(#"Sorted by Main Index",{"m", "b"}),
    //y = m * x + b
    Interpolate = Table.ReplaceValue(#"Filled Down mb",null,each ([m] * Number.From(Record.Field(_, xColumn)) + [b]),Replacer.ReplaceValue,{yColumn}),
    //clean up
    #"Remove Temporary Columns" = Table.RemoveColumns(Interpolate,{"m", "b", "InterpolateMainIndex", "InterpolateMainIndexCopy"}),
    #"Restore Types" = Value.ReplaceType(#"Remove Temporary Columns", Value.Type(Input))
in
    #"Restore Types"
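
A minimal usage sketch, assuming the function above has been saved as a query named fnInterpolate (that name, the sample table, and its column names "x" and "y" are made up for illustration):

```powerquery
let
    // Hypothetical sample data: y is missing at x = 2 and x = 3
    Source = #table(
        type table [x = number, y = nullable number],
        {{1, 10}, {2, null}, {3, null}, {4, 40}}
    ),
    // Sort by x (and buffer) so that row order matches numerical order
    Sorted = Table.Buffer(Table.Sort(Source, {{"x", Order.Ascending}})),
    // fnInterpolate is an assumed name for the query holding the function above
    Result = fnInterpolate(Sorted, "x", "y")
in
    Result
```

The straight line through (1, 10) and (4, 40) gives y = 20 at x = 2 and y = 30 at x = 3, so those are the values to look for in the two gaps.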

Upvotes: 2
