Przemyslaw Remin
Przemyslaw Remin

Reputation: 6940

Power Query join on the least difference

How to make a Power Query join of two tables on least difference between columns. I mean absolute difference between numbers.

enter image description here

I followed this great article: https://exceed.hr/blog/merging-with-date-range-using-power-query/

I tried adding this custom column analogously, where L stands for Tab1 and R for Tab2:

= Table.AddColumn(
    Source, 
    "LeastAbsDifference", 
    (L) =>
      Table.SelectRows( Tab2, 
         (R) => L[category] = R[category] and Number.Abs(L[target] - R[actual]) )
  )

It produces error: Expression.Error: We cannot convert the value 4 to type Logical.


Tables to recreate example:

// Tab1
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJVitWJVnKCs5whrFgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [category = _t, target = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"target", Int64.Type}})
in
    #"Changed Type"
// Tab2
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJUitWBsIzgLGMwywnIMoGzTOEsM6XYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [category = _t, actual = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"actual", Int64.Type}})
in
    #"Changed Type"

Upvotes: 0

Views: 185

Answers (2)

Przemyslaw Remin
Przemyslaw Remin

Reputation: 6940

I would like to acknowledge a favor of Kristian Rados, who provided the answer to my question in the comments of his article: Merging with date range using Power Query With my gratitude, and by courtesy of the author, I am quoting the answer in full:


The reason your formula produces an error is the second argument of the Table.SelectRows function. In it, you need to filter the table with the boolean (true/false) expression. In your case, the part of the code with Number.Abs function returns a number instead of true/false (e.g. L[target] – R[actual] = 5-1=4 ). Trying to filter the table this way could be possible, but it would require you to use multiple nested environments, which would result in a very complicated formula and slow performance.

I would suggest trying a different approach. By using your example from stack overflow I reproduced the problem. Below is a complete M code I came up with along with the explanation below:

let
    Source = Tab1,
    #"Merged Queries" = Table.NestedJoin(Source, {"category"}, Tab2, {"category"}, "Tab2", JoinKind.LeftOuter),
    #"Expanded Tab2" = Table.ExpandTableColumn(#"Merged Queries", "Tab2", {"actual"}, {"actual"}),
    #"Inserted Subtraction" = Table.AddColumn(#"Expanded Tab2", "Least difference", each Number.Abs([target] - [actual]), Int64.Type),
    #"Grouped Rows" = Table.Group(#"Inserted Subtraction", {"category"}, {{"All", each Table.First(Table.Sort(_, {{"Least difference", Order.Ascending}}))}}),
    #"Expanded All" = Table.ExpandRecordColumn(#"Grouped Rows", "All", {"target", "actual", "Least difference"}, {"target", "actual", "Least difference"})
in
    #"Expanded All"

First, we merged the queries by using the category column. After expanding the table, we subtracted two columns to get the absolute difference between target and actual. Finally, we group by category and sort the table by the Least difference column in ascending order (Table.Sort function inside the grouped rows). After this, we take the first row of the nested table (Table.First function), and finally expand the record column.


Upvotes: 0

Ron Rosenfeld
Ron Rosenfeld

Reputation: 60174

Here's one way:

  • Append the two tables
  • Group by Category
  • Output the desired columns as a Group aggregation
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJVitWJVnKCs5whrFgA", BinaryEncoding.Base64), 
        Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [category = _t, target = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"target", Int64.Type}}),
    Source2 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJUitWBsIzgLGMwywnIMoGzTOEsM6XYWAA=", 
        BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [category = _t, actual = _t]),
    #"Changed Type1" = Table.TransformColumnTypes(Source2,{{"actual", Int64.Type}}),


//append the tables
    append = Table.Combine({#"Changed Type",#"Changed Type1"}),

//Group by category, then output the desired columns
    #"Grouped Rows" = Table.Group(append, {"category"}, {
        {"target", each [target]{0},Int64.Type},
        
        {"actual", (t)=> t[actual]{
            List.PositionOf(List.Transform(t[actual], each Number.Abs(t[target]{0} - _)),
                List.Min(List.Transform(t[actual], each Number.Abs(t[target]{0} - _))),Occurrence.First)},Int64.Type},
        
        {"least difference", (t)=> List.Min(List.Transform(t[actual], each Number.Abs(t[target]{0} - _))),Int64.Type
        
        }})
in
    #"Grouped Rows"

Output from above code
enter image description here

Upvotes: 1

Related Questions