kastulo
kastulo

Reputation: 861

MySql Distinct values of two different columns of the same table

I have 2 tables, Products, and Genre.. In the products table its all the info on the product including two genres, (genre_id and genre_id2), only the first one is required, so the second could have null values

in the Genre table I have all the possible genres names with an ID to relate to the genres ID's of the product table

TABLE PRODUCTS
--------------
id    product_name    genre_id      genre_id2
-------------------------------------------------
1     product             1             2
2     product2            2 
3     product3            1             4
4     product4            3             4

TABLE GENRE
-----------
id    genre_name
-------------------------------------------------
1     genre1
2     genre2
3     genre3
4     genre4

I want to select all the different genres and see how many products i have of that genre

something like this

RESULT
------
genre_id       count
-----------------------
    1            2
    2            2
    3            1
    4            2

I have this statement

SELECT DISTINCT p.genre_id AS genre, g.genre_name,  COUNT(p.genre_id) AS cnt
FROM products AS p
JOIN genre AS g
ON p.genre_id=g.id
GROUP BY genre_id
ORDER BY cnt DESC

but only works on genre_id and i dont know how to incorporate genre_id2 in this statement and add the counts that coincide with genre_id and list the different ones

Upvotes: 2

Views: 2594

Answers (3)

Ilmari Karonen
Ilmari Karonen

Reputation: 50368

Given that you have the genre table to join against, the "obvious" solution would be:

SELECT genre.id AS genre, COUNT(products.id) AS n
FROM genre
  LEFT JOIN products ON genre.id IN (genre_id, genre_id2)
GROUP BY genre.id

(SQLFiddle demo)

If you didn't, though, you could still do this using UNION:

SELECT genre, COUNT(*) as n
FROM
  (SELECT id, genre_id AS genre FROM products
   WHERE genre_id IS NOT NULL
   UNION
   SELECT id, genre_id2 AS genre FROM products
   WHERE genre_id2 IS NOT NULL) AS foo
GROUP BY genre

(SQLFiddle demo)


Edit: The UNION method won't (and can't) return any rows with a count of 0. The "obvious" method does, because I used a LEFT JOIN. If you don't want them, you can eliminate them from the "obvious" method too by replacing the LEFT JOIN with just a JOIN.


Edit 2: With appropriate indexes (one on each of genre_id and genre_id2), and depending on the size and content of the actual data set, the following solution using dependent subqueries might be more efficient than either of the above:

SELECT genre.id AS genre,
  (SELECT COUNT(*) FROM products WHERE genre.id = genre_id) +
  (SELECT COUNT(*) FROM products WHERE genre.id = genre_id2) AS n
FROM genre

To eliminate rows with zero counts, just stick

HAVING n > 0

at the end of the query. (SQLFiddle demo) This is actually a generic method for filtering out such rows.

I haven't benchmarked this against JW's solution to see which one is more efficient, since that would require some actual data. If your dataset is fairly small, it may not matter either way.

(The outcome will depend quite heavily on how well MySQL optimizes them: JW's nested LEFT JOINs could get very slow on large data sets if executed naïvely, but I'm not sure if MySQL is smart enough not to do that. Meanwhile, my dependent subqueries probably won't get optimized much, but even a naïve execution should be reasonably fast as long as the necessary indexes are there.)


Edit 3: Note that, in general, this problem arises from bad table design. It would be better to change the schema to use a junction table, as described e.g. in this answer.

This would allow you to have each product belong to any number of genres, and would let you easily count the products in each genre using a simple query like this:

SELECT genre.id AS genre, COUNT(products.id) AS n
FROM genre
  JOIN product_genre ON genre.id   = product_genre.genre
  JOIN products      ON product.id = product_genre.product
GROUP BY genre.id

Upvotes: 2

John Woo
John Woo

Reputation: 263883

SELECT  a.ID, COUNT(DISTINCT b.ID) + COUNT(DISTINCT c.ID)
FROM    Genre a
        LEFT JOIN products b
            ON a.ID = b.genre_id
        LEFT JOIN products c
            ON a.ID = c.genre_id2
GROUP BY a.ID

Caveat: This will not work if you have a record with the same genre_id and genre_ID2 ex

5     product5            1            1

Upvotes: 3

Cody
Cody

Reputation: 39

Write a similar statement with genre_id2 and do an union with the original result

Upvotes: 1

Related Questions