Skynet
Skynet

Reputation: 578

SQL: get count of articles in two categories

In my MySQL db I have a table storing the association of articles and categories, columns are category_id and product_id. So the article #5 belonging to categories #3 and #6 has the rows (3,5)(6,5).

Now I need to get the count of articles which are both in category 1 and 6. Getting the count of articles in one category is easy with SELECT count(category_id) from s_a_c where category_id=1 but how do I expand this Query to check for two cats?

Upvotes: 2

Views: 1292

Answers (4)

lc.
lc.

Reputation: 116458

Here's one way with two joins (one join for each category you want to search for):

SELECT COUNT(1)
FROM articles
INNER JOIN s_a_c c1 ON articles.product_id = c1.product_id
    AND c1.category_id = 1
INNER JOIN s_a_c c2 ON articles.product_id = c2.product_id
    AND c2.category_id = 6

And here's another way with a HAVING clause. The derived table pulls all products from s_a_c with category 1 and 6, having both (COUNT(1) = 2). This takes advantage of the fact that {product_id, category_id} will be unique:

SELECT COUNT(1)
FROM
(
    SELECT product_id
    FROM s_a_c
    WHERE category_id IN (1,6)
    GROUP BY product_id
    HAVING COUNT(1) = 2
) x

Upvotes: 2

eggyal
eggyal

Reputation: 125845

You need to group your table by product, then filter the groups for those that match your desired criteria (using the HAVING clause, which is evaluated after grouping whereas WHERE clauses are evaluated before grouping):

SELECT COUNT(*) FROM (
  SELECT   article_id
  FROM     s_a_c
  WHERE    category_id IN (1,6)
  GROUP BY product_id
  HAVING   COUNT(DISTINCT category_id) = 2
) t

If (product_id,category_id) are guaranteed to be unique (e.g. through a uniqueness constraint enforced by a UNIQUE key), you can use the more performant COUNT(*) in place of COUNT(DISTINCT category_id).

If you need to implement more complex logic on the group filter, you can take advantage of MySQL's lack of genuine boolean types thus:

SELECT   article_id
FROM     s_a_c
WHERE    category_id IN (1,3,6)
GROUP BY product_id
HAVING   SUM(category_id = 1)
     AND SUM(category_id = 6)
     AND NOT SUM(category_id = 3)

Note that I continue to include a WHERE clause in order that MySQL can use an index to avoid a full table scan if possible.

Upvotes: 1

Michael Marr
Michael Marr

Reputation: 2099

EDIT: Misread - this original query does if exists in either or category:

SELECT COUNT(*) FROM s_a_c WHERE category_id IN (1,6);

The correct query for existing in BOTH categories:

SELECT COUNT(*) FROM s_a_c WHERE category_id = 1 AND category_id 6;

TBH, this all really premature w/o seeing the schema for s_a_c.

Upvotes: -1

Martin Sheeks
Martin Sheeks

Reputation: 137

To get all articles (may have duplicates) in two categories (one and two)

SELECT count(category_id) from s_a_c where category_id=1 or category_id=2

To get all articles (without duplicates) in two categories (one and two)

SELECT count(category_id) FROM s_a_c WHERE category_id=1 or category_id=2 GROUP BY article_id

Upvotes: -1

Related Questions