Nick Developer
Nick Developer

Reputation: 287

SQL Server query where distinct won't work - to select only unique records - and looking only at one column for uniqueness

I have this query where I am fetching records from a SQL Server database table that stores all of the products that users on our site has looked at (ProductRecentlyViewed table).

I am trying to select only the records that have unique rv.productvariantid (I want it to return only one of each product variant that the user has looked at).

I have tried using the keyword DISTINCT but since I also need the DateCreated for the orderby and since this is different among the records with the same rv.productvariantid, I cannot use DISTINCT.

What can I do instead to get the 20 newest unique product variants that the user has looked at? Here is my current SQL query which selects all the product variants that the user has looked at - and unfortunately it shows them more than once if the user did look at them more than once.

SELECT TOP 20
    rv.productId,
    p.productNameNO as productName,
    picid,
    picurl,
    rv.productRecentlyViewedId,
    rv.dateCreated,
    rv.lang,
    rv.isUserLoggedIn,
    rv.userId,
    rv.productVariantId
FROM 
    ProductRecentlyViewed rv
INNER JOIN 
    Product AS p ON rv.productid = p.productid
LEFT JOIN 
    (SELECT 
         productid,
         picurl,
         picid,
         ROW_NUMBER() OVER (PARTITION BY productid ORDER BY isfrontpic DESC) rn
     FROM 
         productpic) c ON c.rn = 1
                       AND rv.productId = c.productId
WHERE 
    rv.lang = 'NO'
    AND rv.cookieId = CONVERT(uniqueidentifier, '1f102c74-278b-430e-8129-1261dfc7e2ac')
ORDER BY 
    rv.dateCreated

Upvotes: 0

Views: 78

Answers (2)

Barbaros Özhan
Barbaros Özhan

Reputation: 65218

First, select all records within the subquery, and then filter by rn=1

SELECT TOP 20 q.*
  FROM (SELECT rv.productId,
               p.productNameNO as productName,
               picid,
               picurl,
               rv.productRecentlyViewedId,
               rv.dateCreated,
               rv.lang,
               rv.isUserLoggedIn,
               rv.userId,
               rv.productVariantId,
               pv.productVariantName,
               ROW_NUMBER() OVER(PARTITION BY rv.productid ORDER BY c.isfrontpic DESC) rn 
          FROM ProductRecentlyViewed rv
         INNER JOIN Product as p
            ON rv.productid = p.productid
          LEFT JOIN productpic c
            ON rv.productId = c.productId
          LEFT JOIN ProductVariant pv
            ON productVariantId = rv.productVariantId 
         WHERE rv.lang = 'NO'
           AND rv.cookieId = CONVERT(uniqueidentifier, '123')) q
 WHERE q.rn = 1
 ORDER BY q.dateCreated

The grouping is performed by PARTITION BY productid, and get only one productid from each group by rn=1

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1269703

I would recommend doing the filtering like this:

SELECT rv.productId, p.productNameNO as productName,
       pp.picid, pp.picurl,
       rv.productRecentlyViewedId, rv.dateCreated, rv.lang,
       rv.isUserLoggedIn, rv.userId, rv.productVariantId
FROM (SELECT TOP (20) rv.*
      FROM (SELECT rv.*,
                   ROW_NUMBER() OVER (PARTITION BY rv.cookieId, rv.productId ORDER BY rv.dateCreated DESC) as seqnum
            FROM ProductRecentlyViewed rv
            WHERE rv.lang = 'NO' AND
                  rv.cookieId = CONVERT(uniqueidentifier, '1f102c74-278b-430e-8129-1261dfc7e2ac')
           ) rv
      WHERE seqnum = 1
     ) rv JOIN 
    Product p
    ON rv.productid = p.productid OUTER APPLY
    (SELECT TOP (1) pp.*
     FROM productpic pp
     WHERE pp.productid = p.productid
     ORDER BY pp.isfrontpic DESC
    ) pp
ORDER BY rv.dateCreated;

This gives the optimizer more options for making the query faster. In particular, it does not multiply all the rows out and then determine the first picture for each row.

Instead, it finds the 20 rows and then gets the picture for each one.

Upvotes: 1

Related Questions