Obiii
Obiii

Reputation: 834

Combine columns from two sources

I have two sources resulting from some transformation in data flow:

enter image description here

I have tried using join, it replicates the data no matter join I select it outputs similar stuff: enter image description here

I have tried union as well but union either creates null in columns (if done by name) or rows (if done by position)

Shouldnt the join just concat the columns together because the IDs are same in both table.

This is how the desired ouput should look: I want concat the version column to the first source so that it looks like this:

 ID       name    value    version
 111     file1      0.1        3
 111     file2      0.82      15
 111     file3      2.2         2

Upvotes: 0

Views: 1692

Answers (2)

NiharikaMoola
NiharikaMoola

Reputation: 5074

Both of your source files have only one matching column (ID) and it is not unique. When you join both sources on the ID column, each row of source1 joins with all the matching rows of source2.

Here, your row1 (111) of source1 joins with all 3 matching rows (111) of source2, hence it results in 9 rows with different version values for each row in source1.

To get only 3 rows as your expected results, you need a unique matching row in each source.

  1. Add window transformation for both sources and get the rowNumber() based on the ID column.

Source1->window1:

enter image description here

enter image description here

enter image description here

Window1 data preview:

enter image description here

Source2->window2:

enter image description here

enter image description here

enter image description here

enter image description here

Window2 data preview:

enter image description here

  1. Add join transformation to join data from window transformations on ID and rank columns.

enter image description here

Join data preview:

enter image description here

  1. Add select transformation to remove the unwanted columns.

enter image description here

Select data preview:

enter image description here

Upvotes: 1

Mark Kromer MSFT
Mark Kromer MSFT

Reputation: 3838

That is expected with a join. For example, when you join tables in SQL, you also supply the target projection as part of the select statement. What you need to do here is add a Select transformation after your Join transformation. In there, you will reduce the projection to just the columns that would like to retain. You'll be able to choose which side (left or right) you would like to keep for the ID column.

Upvotes: 0

Related Questions