Reputation: 682
I am having trouble understanding how to join three tables together without creating duplicate records.
I have my first query which uses table profile, and tabe instruments:
SELECT
[p].[shopper_id]
, [pi].[instrument_id]
FROM
[dbo].[profile] [p]
INNER JOIN [dbo].[profile_instruments] [pi]
ON [pi].[PID] = [p].[PID]
WHERE
[p].[date_created] > DATEADD(yy, -2, GETDATE())
AND [p].[shopper_id] = '53D5444535434747A935E207C9EDD96A'
ORDER BY
[p].[shopper_id];
This query gives me the results:
shopper_id instrument_id
53D5444535434747A935E207C9EDD96A 35
53D5444535434747A935E207C9EDD96A 17
My second query uses table profile and table styles:
SELECT
[p].[shopper_id]
, [ps].[style_id]
FROM
[dbo].[profile] [p]
INNER JOIN [dbo].[profile_styles] [ps]
ON [ps].[PID] = [p].[PID]
WHERE
[p].[date_created] > DATEADD(yy, -2, GETDATE())
AND [p].[shopper_id] = '53D5444535434747A935E207C9EDD96A'
ORDER BY
[p].[shopper_id];
The results are:
shopper_id style_id
53D5444535434747A935E207C9EDD96A 845
53D5444535434747A935E207C9EDD96A 291
When I combine the 3 tables profile, instruments and styles:
SELECT
[p].[shopper_id]
, [pi].[instrument_id]
, [ps].[style_id]
FROM
[dbo].[profile] [p]
INNER JOIN [dbo].[profile_instruments] [pi]
ON [pi].[PID] = [p].[PID]
INNER JOIN [dbo].[profile_styles] [ps]
ON [ps].[PID] = [p].[PID]
WHERE
[p].[date_created] > DATEADD(yy, -2, GETDATE())
AND [p].[shopper_id] = '53D5444535434747A935E207C9EDD96A'
ORDER BY
[p].[shopper_id];
I get the results:
shopper_id instrument_id style_id
53D5444535434747A935E207C9EDD96A 35 845
53D5444535434747A935E207C9EDD96A 35 291
53D5444535434747A935E207C9EDD96A 17 845
53D5444535434747A935E207C9EDD96A 17 291
I am not using a group by yet because I am unsure how to apply it with the columns I have. I am also unsure if the duplicates are because of the type of join I am using, or because I am not using a group by.
At any rate, I would like to ask for some help to be able to figure out what modifications to my query I need to make in order to have an output that looks like:
shopper_id instrument_id style_id
53D5444535434747A935E207C9EDD96A 35 845
53D5444535434747A935E207C9EDD96A 17 291
Thank you in advance for your help.
Upvotes: 1
Views: 35
Reputation: 35323
Maybe...
We assign a row number to each style and instrument for each PID. Then not only do we join by PID, we join by Row_number which guarantees when a pID has 2 instruments and two styles, we still only get 2 records; instead of 4.
A full outer join is used as I don't know if you want to see situations where 2 instruments exists but only 1 style or 2 styles exist but only 1 instrument.
SELECT [p].[shopper_id]
, [pi].[instrument_id]
, [ps].[style_id]
FROM [dbo].[profile] [p]
INNER JOIN (SELECT A.*, row_number() over (partition by PID order by instrument_ID) RN
FROM [dbo].[profile_instruments] A) [pi]
ON [pi].[PID] = [p].[PID]
FULL OUTER JOIN (SELECT A.*, Row_number() over (partition by PID order by style_ID) RN
FROM [dbo].[profile_styles] A) [ps]
ON [ps].[PID] = [p].[PID]
AND [PI].[RN] = [PS].[RN]
WHERE [p].[date_created] > DATEADD(yy, -2, GETDATE())
AND [p].[shopper_id] = '53D5444535434747A935E207C9EDD96A'
ORDER BY [p].[shopper_id];
We may be able to use a coalesce on the PI.RN or PS.RN or both if you are willing to accept that when styles and instruments have different counts that matching to a single value in either table is acceptable
Example:
PID STYLE_ID PID Instrument_ID
1 A 1 Z
1 B
The above should return
1 A Z
1 B
But maybe you want
1 A Z
1 B Z
This may work if we change AND [PI].[RN] = [PS].[RN]
to
AND coalesce([PI].[RN],1) = coalesce([PS].[RN],1)
but testing needs to occur. As if one side is devoid of all records for a pid, you'd still get null on one table.
Upvotes: 2
Reputation: 11
Try making a natural join, would look similar to this:
SELECT
[p].[shopper_id],
[pi].[instrument_id],
[ps].[style_id]
FROM
[dbo].[profile] [p],
[dbo].[profile_instruments] [pi],
[dbo].[profile_styles] [ps]
WHERE
[p].[date_created] > DATEADD(yy, -2, GETDATE())
AND [p].[shopper_id] = '53D5444535434747A935E207C9EDD96A'
AND [pi].[PID] = [p].[PID]
AND [ps].[PID] = [p].[PID]
ORDER BY
[p].[shopper_id];
Upvotes: 0