Trevor Oakley
Trevor Oakley

Reputation: 448

SQL joins to avoid duplicates in two tables

I have two tables, one for images and one for parts. A part has an image, but the parts are duplicated and hence the images are duplicated. I want to download the images just one for a unique part number.

One PartNumber HAS many images (all the same at different locations)
One PartID HAS one image (unique imageid)
One ImageID HAS one partnumber

I have summarised the tables below:

Part Table: PartNumber, ImageID

Image Table: ImageID, ImageName, ImageSaved (true or false - 1,0)

An example data would be:

Parts: (P1,1), (P1, 2), (P1, 3), (P2, 4)
Images: (1, I1, 1), (2, I2, 0), (3, I3, 0), (4, I4, 0)

Hence, P1 has images I1..3 but they are actually the same image but stored at different locations. Hence once it is downloaded once (imagesaved = 1) the other records for that part may be ignored.

I have tried the following to extract all the images but I need to add something to stop the system downloading all the duplicate images (that is happening now).

select * from trahanimage ti1 inner join  
(
select top 10 ti.imageid, trp.manpartnumber, ti.imagename,  
ti.imageurl,       row_number() 
over(partition by trp.manpartnumber order by trp.manpartnumber) as rk
from trahanimage ti
inner join trahanretailerpart trp 
on trp.imageid = ti.imageid
and ti.imagetype = 2
and ti.imageSaved = 0
and trp.currentpart = 1
and trp.shopid = 10 ) as summary on summary.imageid = ti1.imageid
and summary.rk = 1

This works for the first time, but after the download the imagesaved to changed to 1 for one record (eg I1) and then when the next batch runs it needs to ignore the images joined for the part with an image already saved. I think a new join is needed after summary.rk. Logically speaking the logic is

and summary.rk =1 and ti1.imageid not in (select imageid from
trahanimage where imagesaved = 1 and manpartnumber = part number in
inner join).

Any help is welcome.

Upvotes: 0

Views: 56

Answers (2)

Juan Carlos Oropeza
Juan Carlos Oropeza

Reputation: 48177

This return the parts with no images download yet.
If part have any saveimage then SUM will be <> 0

SQL FIDDLE DEMO

 SELECT p.PartNumber, SUM(I.ImageSaved) Download, MIN(p.ImageID) DownloadID
 FROM Parts p 
 INNER JOIN Image I
 ON   p.ImageID = I.ImageID
 GROUP BY p.PartNumber
 HAVING SUM(I.ImageSaved) = 0

Upvotes: 1

Trevor Oakley
Trevor Oakley

Reputation: 448

OK, I solved this now using EXISTS

where  not exists (select trp1.manpartnumber from trahanretailerpart trp1
 inner join trahanimage tix on tix.imageid = trp1.imageid and
tix.imagesaved = 1 and trp1.currentpart = 1 and tix.imagetype = 2 and
trp.shopid = 10)

Upvotes: 0

Related Questions