hideki
hideki

Reputation: 1330

T-SQL query that includes missing records

Apologies if this a little confusing to follow but hopefully someone can help me with this. For some reason my mind is drawing a blank on this and also typing the problem out might help.

I am doing some ETL stuff and I have data that can be grouped by a specific Id value. Each record in my dataset has a LookupId value that I use to join to another table and get additional information. Pretty standard INNER JOIN stuff here. However, I need to also include missing LookupId values in my dataset. This turns my INNER JOIN into a LEFT JOIN which is also pretty standard stuff.

You can use the following sample schema and data below to get an idea of what I am looking to do:

-- Create sample schema
CREATE TABLE ETL_DataSet (
    Id INT NOT NULL,
    LookupId INT NULL,
    LookupCode CHAR(1) NULL,
    LookupDesc NVARCHAR(40) NULL
);

CREATE TABLE ETL_LookupTable (
    LookupId INT NOT NULL
    , LookupCode CHAR(1) NOT NULL
    , LookupDesc NVARCHAR(40) NOT NULL
    , CONSTRAINT PK_LookupTable_LookupId PRIMARY KEY CLUSTERED (LookupId)
);

-- Insert sampel data
INSERT INTO ETL_LookupTable (LookupId, LookupCode, LookupDesc) VALUES
    (1, 'Z', 'Z Description'),
    (2, 'A', 'A Description'),
    (3, 'B', 'B Description'),
    (4, 'C', 'C Description'),
    (5, 'D', 'D Description'),
    (6, 'E', 'E Description'),
    (7, 'X', 'X Description');

INSERT INTO ETL_DataSet (Id, LookupId, LookupCode, LookupDesc) VALUES 
    (1, 3, 'B', 'B Description'),
    (1, 5, 'D', 'D Description'),
    (1, 3, 'B', 'B Description'),
    (1, 2, 'A', 'A Description'),
    (2, 4, 'C', 'C Description'),
    (2, 6, 'E', 'E Description'),
    (2, 3, 'B', 'B Description'),
    (2, 2, 'A', 'A Description');

And here is the sample script to run:

DECLARE @id INT = 1; //change this to 1 or 2
;WITH LookupCTE AS
(
    SELECT d.*
    FROM ETL_LookupTable l INNER JOIN ETL_DataSet d ON l.LookupId = d.LookupId
    WHERE d.Id = @id --comment this out to get all data in ETL_DataSet
)
SELECT *
FROM ETL_LookupTable l LEFT JOIN LookupCTE cte ON l.LookupId = cte.LookupId

The end goal is to include all LookupTable values for each Id even if the set of Ids do not have the LookupTable values. Changing @Id between 1 and 2 you can see that this works but if you comment out the WHERE clause in the CTE, you get incorrect data. I am trying to do this without using cursors or any other row-by-agonizing-row technique so hopefully someone can assist or point me in the right direction.

Desired results should look like the following:

Id          LookupId    LookupCode LookupDesc
----------- ----------- ---------- ----------------------------------------
1           2           A          A Description
1           3           B          B Description
1           3           B          B Description
1           NULL        C          C Description
1           5           D          D Description
1           NULL        E          E Description
1           NULL        X          X Description
1           NULL        Z          Z Description
2           2           A          A Description
2           3           B          B Description
2           4           C          C Description
2           NULL        D          D Description
2           6           E          E Description
2           NULL        X          X Description
2           NULL        Z          Z Description

Upvotes: 0

Views: 89

Answers (4)

Ajay
Ajay

Reputation: 774

CREATE TABLE #ETL_DataSet (
    Id INT NOT NULL,
    LookupId INT NULL,
    LookupCode CHAR(1) NULL,
    LookupDesc NVARCHAR(40) NULL
);

CREATE TABLE #ETL_LookupTable (
    LookupId INT NOT NULL
    , LookupCode CHAR(1) NOT NULL
    , LookupDesc NVARCHAR(40) NOT NULL
    , CONSTRAINT PK_LookupTable_LookupId PRIMARY KEY CLUSTERED (LookupId)
);

-- Insert sampel data
INSERT INTO #ETL_LookupTable (LookupId, LookupCode, LookupDesc) VALUES
    (1, 'Z', 'Z Description'),
    (2, 'A', 'A Description'),
    (3, 'B', 'B Description'),
    (4, 'C', 'C Description'),
    (5, 'D', 'D Description'),
    (6, 'E', 'E Description'),
    (7, 'X', 'X Description');

INSERT INTO #ETL_DataSet (Id, LookupId, LookupCode, LookupDesc) VALUES 
    (1, 3, 'B', 'B Description'),
    (1, 5, 'D', 'D Description'),
    (1, 3, 'B', 'B Description'),
    (1, 2, 'A', 'A Description'),
    (2, 4, 'C', 'C Description'),
    (2, 6, 'E', 'E Description'),
    (2, 3, 'B', 'B Description'),
    (2, 2, 'A', 'A Description');


select * from #ETL_DataSet
select * from #ETL_LookupTable

SELECT E.ID,D.LOOKUPID,E1.LOOKUPCODE,E1.LOOKUPDESC 
FROM(
SELECT DISTINCT ID 
FROM #ETL_DATASET) AS E 
CROSS JOIN (SELECT * FROM #ETL_LOOKUPTABLE) AS E1 
LEFT JOIN #ETL_DATASET D ON D.LOOKUPID = E1.LOOKUPID AND D.ID = E.ID
ORDER BY E.ID,E1.LOOKUPCODE,E1.LOOKUPDESC 

OUTPUT

+----+----------+------------+---------------+
| ID | LOOKUPID | LOOKUPCODE |  LOOKUPDESC   |
+----+----------+------------+---------------+
|  1 | 2        | A          | A Description |
|  1 | 3        | B          | B Description |
|  1 | 3        | B          | B Description |
|  1 | NULL     | C          | C Description |
|  1 | 5        | D          | D Description |
|  1 | NULL     | E          | E Description |
|  1 | NULL     | X          | X Description |
|  1 | NULL     | Z          | Z Description |
|  2 | 2        | A          | A Description |
|  2 | 3        | B          | B Description |
|  2 | 4        | C          | C Description |
|  2 | NULL     | D          | D Description |
|  2 | 6        | E          | E Description |
|  2 | NULL     | X          | X Description |
|  2 | NULL     | Z          | Z Description |
+----+----------+------------+---------------+

Upvotes: 2

Paul Maxwell
Paul Maxwell

Reputation: 35583

You seem to want a Cartesian product of the etl_dataset.id with the etl_lookupTable information, so a cross join forming that product should work.

select
ed.*, el.*, d.*
from (select distinct id from etl_dataset) ed
cross join (select distinct LookupId, LookupCode, LookupDesc from ETL_LookupTable) el
left join etl_dataset d on ed.id = d.id and el.LookupId = d.LookupId
order by 
ed.id, el.LookupCode
GO
id | LookupId | LookupCode | LookupDesc    |   Id | LookupId | LookupCode | LookupDesc   
-: | -------: | :--------- | :------------ | ---: | -------: | :--------- | :------------
 1 |        2 | A          | A Description |    1 |        2 | A          | A Description
 1 |        3 | B          | B Description |    1 |        3 | B          | B Description
 1 |        3 | B          | B Description |    1 |        3 | B          | B Description
 1 |        4 | C          | C Description | null |     null | null       | null         
 1 |        5 | D          | D Description |    1 |        5 | D          | D Description
 1 |        6 | E          | E Description | null |     null | null       | null         
 1 |        7 | X          | X Description | null |     null | null       | null         
 1 |        1 | Z          | Z Description | null |     null | null       | null         
 2 |        2 | A          | A Description |    2 |        2 | A          | A Description
 2 |        3 | B          | B Description |    2 |        3 | B          | B Description
 2 |        4 | C          | C Description |    2 |        4 | C          | C Description
 2 |        5 | D          | D Description | null |     null | null       | null         
 2 |        6 | E          | E Description |    2 |        6 | E          | E Description
 2 |        7 | X          | X Description | null |     null | null       | null         
 2 |        1 | Z          | Z Description | null |     null | null       | null         

dbfiddle here

Upvotes: 2

Nolan Shang
Nolan Shang

Reputation: 2328

try this:

DECLARE @id INT = 2; --//change this to 1 or 2
;WITH IdList AS(
   SELECT DISTINCT id FROM ETL_DataSet
), LookupCTE AS
(
    SELECT IdList.id,d.LookupId, d.LookupCode, d.LookupDesc
    FROM ETL_LookupTable l INNER JOIN IdList ON 1=1
    LEFT  JOIN ETL_DataSet d ON l.LookupId = d.LookupId AND d.id=IdList.id
   -- WHERE d.Id = @id --comment this out to get all data in ETL_DataSet
)

SELECT * FROM LookupCTE --WHERE id=@id
1   NULL    NULL    NULL
1   2   A   A Description
1   3   B   B Description
1   3   B   B Description
1   NULL    NULL    NULL
1   5   D   D Description
1   NULL    NULL    NULL
1   NULL    NULL    NULL
2   NULL    NULL    NULL
2   2   A   A Description
2   3   B   B Description
2   4   C   C Description
2   NULL    NULL    NULL
2   6   E   E Description
2   NULL    NULL    NULL

Upvotes: 1

Valeria Elias
Valeria Elias

Reputation: 51

I think for the results you are looking for you should set ETL_DataSet as the left table. something like:

SELECT * FROM ETL_DataSet LEFT JOIN ETL_LookupTable ON ETL_DataSet.LookupId = ETL_LookupTable.LookupId 

Ref: https://www.w3schools.com/sql/sql_join_left.asp

Note: The LEFT JOIN keyword returns all records from the left table ... , even if there are no matches in the right table.

Upvotes: 0

Related Questions