Michele Locati
Michele Locati

Reputation: 1844

How to join two tables in PowerQuery with one of many columns matching?

Let's assume we have two PowerQuery queries:

  1. a query named Query1 with columns Col1_1 and Col1_2
  2. a query named Query2 with columns Col2_1 and Col2_2

I know it's possible to join these two queries with a merge query like this:

let
    Source = Table.NestedJoin(Query1,{"Col1_1", "Col1_2"},Query2,{"Col2_1", "Col2_2"},"Query2",JoinKind.LeftOuter)
in
    Source

In SQL, this could be represented as

SELECT
    *
FROM
    Query1
    LEFT JOIN Query2 ON Query1.Col1_1 = Query2.Col2_1 AND Query1.Col1_2 = Query2.Col2_2

Question: is it possible to join the two queries if at least one of the two columns match? In SQL, this could be represented as

SELECT
    *
FROM
    Query1
    LEFT JOIN Query2 ON Query1.Col1_1 = Query2.Col2_1 OR Query1.Col1_2 = Query2.Col2_2

Upvotes: 3

Views: 4624

Answers (2)

Wedge
Wedge

Reputation: 1826

As far as I know there are no options to alter the default method of matching in Join functions in PQ. You could however do two joins on each of the columns you want, then combine the results of those joins.

This would result in duplicate matches when both col1 and col2 match, which I'm not sure if that is the intended result. If not, you can use PQ to setup indexes to catch these duplicates and remove them.

Assuming Query2 also has had an Index added that looks like this:

let
    Source = Query1,
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
    #"Merged Queries" = Table.NestedJoin(#"Added Index",{"col1"},Query2,{"col1"},"col1Join",JoinKind.LeftOuter),
    #"Merged Queries1" = Table.NestedJoin(#"Merged Queries",{"col2"},Query2,{"col2"},"col2Join",JoinKind.LeftOuter),
    #"Added Custom" = Table.AddColumn(#"Merged Queries1", "MergeTables", each Table.Combine({[col1Join],[col2Join]})),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "MergeTables", {"col1", "col2", "Index"}, {"Query2.col1", "Query2.col2", "Query2.Index"}),
    #"Removed Duplicates" = Table.Distinct(#"Expanded Custom", {"Index", "Query2.Index"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",{"Index", "col1Join", "col2Join", "Query2.Index"})
in
    #"Removed Columns"

Not a very scale-able solution, but I think it works correctly?

Upvotes: 1

Alexis Olson
Alexis Olson

Reputation: 40204

Instead of doing a table join, another possibility would be to add a custom column using the logic you want:

Table.AddColumn(Source, "JoinQuery2",
    (Q1) => Table.SelectRows(Query2,
                each Q1[Col1_1] = [Col2_1] or Q1[Col1_2] = [Col2_2]
            )
)

You can then expand this column of tables just like if you did a merge.


I got this idea from this blog post: Join Conditions In Power Query, Part 1

Upvotes: 1

Related Questions