thetipsyhacker
thetipsyhacker

Reputation: 1422

Duplicate Results After Multiple Joins

I've tried using the distinct keyword, and I've tried grouping the columns, but the query still returns duplicate results.

SELECT 
distinct orderformdump.itemno,
case when PRH.ACTIVE = 1 and PRH.STARTDATE < cast(GETDATE() as DATE) and (PRH.ENDDATE IS NULL OR PRH.ENDDATE >= cast(GETDATE() as DATE)) then CAST(1 AS BIT) ELSE CAST(0 AS BIT) END as isindeal
FROM ORDERFORMDUMP
INNER JOIN ICITEM ON ICITEM.FMTITEMNO = orderformdump.itemno
LEFT JOIN PRD2 on ICITEM.ITEMNO = PRD2.ITEMNO
LEFT JOIN PRH on PRD2.CODE = PRH.CODE
order by Itemno

The duplication is cause by either the join, LEFT JOIN PRD2 on ICITEM.ITEMNO = PRD2.ITEMNO, or the column case when PRH.ACTIVE = 1 and PRH.STARTDATE < cast(GETDATE() as DATE) and (PRH.ENDDATE IS NULL OR PRH.ENDDATE >= cast(GETDATE() as DATE)) then CAST(1 AS BIT) ELSE CAST(0 AS BIT) END as isindeal. I can remove the join and get the same results with a subquery, but the subquery causes the query to run very slowly (even when I index columns).

Why are there multiple results and what can I do to fix this?

Upvotes: 0

Views: 201

Answers (2)

Esperento57
Esperento57

Reputation: 17462

try this :

SELECT distinct orderformdump.itemno, isnull(tmp.isindeal, CAST(0 AS BIT) ) isindeal
FROM ORDERFORMDUMP
INNER JOIN ICITEM ON ICITEM.FMTITEMNO = orderformdump.itemno
outer apply
(
 select top 1 CAST(1 AS BIT) as isindeal
 from PRD2 INNER JOIN PRH on PRD2.CODE = PRH.CODE 
 where ICITEM.ITEMNO = PRD2.ITEMNO and PRH.ACTIVE = 1  
 and cast(GETDATE() as DATE) between PRH.STARTDATE and isnull(PRH.ENDDATE, cast(GETDATE() as DATE))
) tmp

Upvotes: 1

Shawn
Shawn

Reputation: 4786

To find your problem row, try:

SELECT itemNo, count(*) as dealCount
FROM (
    SELECT 
    distinct orderformdump.itemno,
    case when PRH.ACTIVE = 1 and PRH.STARTDATE < cast(GETDATE() as DATE) and (PRH.ENDDATE IS NULL OR PRH.ENDDATE >= cast(GETDATE() as DATE)) then CAST(1 AS BIT) ELSE CAST(0 AS BIT) END as isindeal
    FROM ORDERFORMDUMP
    INNER JOIN ICITEM ON ICITEM.FMTITEMNO = orderformdump.itemno
    LEFT JOIN PRD2 on ICITEM.ITEMNO = PRD2.ITEMNO
    LEFT JOIN PRH on PRD2.CODE = PRH.CODE
) t
GROUP BY itemNO
ORDER BY dealCount DESC

Anything with dealCount > 1 will indicate multiple rows in ICITEM, PRD1 or PRH. Then you should be able to figure out which you need to eliminate.

How many rows are in the tables that you're joining on? You can likely make it faster by changing up your JOINs a little bit. You're LEFT JOINing to a LEFT JOIN that you aren't really using all of. Since you only use PRH.ACTIVE, PRH.STARTDATE and PRH.ENDDATE, you can INNER JOIN PRH to PRD2 and reduce the result set that you are LEFT JOINing to ICITEM. Then, I'm guessing that your STARTDATE and ENDDATE are DATE datatype and not datetime (casting getDate() to DATE). If they are a datetime, you may want to account for the time component. For your isindeal, you don't need to cast 1 and 0 as bit. Just return them and deal with them in your code if you need to display a true or false.

Upvotes: 1

Related Questions