Peter de Bruijn
Peter de Bruijn

Reputation: 822

DISTINCT within count SQL Server

I am writing a product search module that uses faceted search. To simplify things a bit there are two entities: product and price information. Since products can be sold by multiple web shops one product can have more price information records.

product

- product_guid
- product_category
- product_brand

priceInformation

- priceInformation_guid
- priceInformation_price
- priceInformation_product_guid

I list all products in a specific category and the user can filter by price range. The price ranges are listed with checkboxes, behind each price range you can see the number of products that match the search criteria. Price ranges that have no matches are not shown. For each price range I do a query to determine the number of matches (facet count).

select count(distinct(product.product_guid)) as count
from product 
  INNER JOIN priceInformation ON product.product_guid =
    priceInformation.priceInformation_product_guid
WHERE category= 'beer' 
  AND product.priceInformation_price > 20
  AND product.priceInformation_price <= 30

Everything works fine but... because the number of ranges is growing (this is just a simplified example) I perform hundreds of queries every search. This is a big performance killer. I tried to get all the facet counts in one query but since there is a 1:n relation between product and priceInformation the result is the number of prices that match, not the number of products.

select (Count(Case WHEN product.priceInformation_price > 20 AND
                        product.priceInformation_price <= 30 THEN 1
                   ELSE NULL END)) as Range2030,
       (Count(Case WHEN product.priceInformation_price > 30 AND
                        product.priceInformation_price <= 40 THEN 1
                   ELSE NULL END)) as Range3040
from product
  INNER JOIN priceInformation ON product.product_guid =
    priceInformation.priceInformation_product_guid
WHERE category= 'beer'

Bottom line is that I miss the DISTINCT here. I'm pulling my hairs out for days. Can anyone help?

Upvotes: 3

Views: 155

Answers (2)

valex
valex

Reputation: 24144

I think you should create a table with price ranges for instance:

create table PriceRanges(minPrice money,maxPrice money);
insert PriceRanges values (0,9),(10,19),(20,29),(100,149) ; 

Then using this table the query is:

SELECT minPrice,maxPrice,COUNT(DISTINCT p.product_guid)
   as ProductCount
FROM PriceRanges pr
LEFT JOIN priceInformation  pi ON pi.priceInformation_price 
           BETWEEN pr.minPrice AND pr.maxPrice
LEFT JOIN product p ON pi.priceInformation_product_guid= p.product_guid
WHERE p.product_category = 'beer'
GROUP BY minPrice,maxPrice
ORDER BY minPrice

Sqlfiddle demo

Or use JOIN instead of LEFT JOIN if you don't need empty price ranges.

Upvotes: 0

jarlh
jarlh

Reputation: 44795

Use a derived table to get different price ranges, do a GROUP BY to count distinct. Something like:

select count(distinct(dt.product_guid)), dt.pricerange as count
from
(
 select product.*, case when price < 20 then 'price < 20'
                        when price between 20 and 30 then 'price 20 - 30'
                        else 'price > 30' end as pricerange
 from product
     INNER JOIN priceInformation ON product.product_guid =
    priceInformation.priceInformation_product_guid
 WHERE category= 'beer'
) dt
group by dt.pricerange

Or did I answer something else here now...?

Upvotes: 1

Related Questions