Reputation: 44
I have a product in different categories, 1 product may be in more than 1 categories. I want to get "which category has the most unique products"
Sample data:
Table: Category
Category_ID || NAME
1 || clothes
2 || bags
3 || shirts
Tabale: category_joins {product + category ids only}
Product_ID || Category_ID
1 || 1
5 || 1
1 || 3
2 || 3
3 || 3
Count Which category has the most unique products?:
Result
Count || Category_id
1 || 1
3 || 3
{Exclude Product_ID "1" from Category_ID "1", because it is in category "3",
and only count Product_ID 1 in Category 3 because Category 3 has most unique products}
Product id "1" in two categories
I want a total number of products in categories, but if product count in 1 category, do not count it in another category.
Help me to write its QUERY.
Upvotes: 0
Views: 742
Reputation: 22811
Proirity of the category is based on the number of products, if equal on Category_ID
. For each product take only top priority category.
select Product_ID, Category_ID
from (
select c1.Product_ID, c1.Category_ID, row_number() over (partition by Product_ID order by rnk) priority
from category_joins c1
join (
select Category_ID, row_number() over (order by count(distinct Product_ID) desc, Category_ID) rnk
from category_joins
group by Category_ID
) cr on cr.Category_ID = c1.Category_ID
) t
where priority = 1
To get count of the uniquely distributed products by category
select Category_ID, count(*) n
from (
select c1.Product_ID, c1.Category_ID, row_number() over (partition by Product_ID order by rnk) priority
from category_joins c1
join (
select Category_ID, row_number() over (order by count(distinct Product_ID) desc, Category_ID) rnk
from category_joins
group by Category_ID
) cr on cr.Category_ID = c1.Category_ID
) t
where priority = 1
group by Category_ID;
Upvotes: 2