Chris Doggett
Chris Doggett

Reputation: 20757

Need a MySQL JOIN to return all categories a product is in, per product, even if only one category is searched for

I have the following tables defined:

CREATE TABLE products (
    product_id INTEGER(11) NOT NULL AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    section VARCHAR(255) DEFAULT NULL,
    PRIMARY KEY (product_id)
) ENGINE=MyISAM;

CREATE TABLE categories (
    category_id INTEGER(11) NOT NULL AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    PRIMARY KEY (category_id)
) ENGINE=MyISAM;

CREATE TABLE product_categories (
    product_id INTEGER(11) NOT NULL,
    category_id INTEGER(11) NOT NULL,
    PRIMARY KEY (product_id, category_id)
) ENGINE=MyISAM;

There are actually many more, and this is part of optimizing much larger, more complex queries. Part of it is moving some slow subqueries into views, which have helped a lot so far.

Queries became very slow when I added the categories/product_categories tables and joined against them when allowing users to search by products.section or categories.category_id. The UI passes those in as search parameters, and I'm trying to get a row per product with its id, name, section, and a comma-separated list of category names associated with it. I was able to make that much faster with the following view and query:

CREATE OR REPLACE
ALGORITHM = MERGE
VIEW view_products_with_categories
AS
SELECT 
    p.product_id,
    p.name,
    p.section,
    pc.name AS category
products p
LEFT JOIN product_categories pc on p.product_id = pc.product_id
LEFT JOIN categories c ON pc.category_id = c.category_id;

SELECT
    product_id,
    name,
    section,
    GROUP_CONCAT(DISTINCT category ORDER BY category) AS categories
FROM view_products_with_categories
GROUP BY product_id;

Let's say we have the following rows:

product_id  name               section          category_id category
332913      Model Train Engine child-and-baby   1160        child-and-baby>baby-and-pre-schooltoys>playsets
332913      Model Train Engine child-and-baby   1308        toys>baby-and-preschool>playsets
332913      Model Train Engine child-and-baby   1312        toys>carstrains-and-planes>cars-and-vehicles

The simple query above gives me the following:

product_id  name                section         categories
332913      Model Train Engine  child-and-baby  child-and-baby>baby-and-pre-schooltoys>playsets,toys>baby-and-preschool>playsets,toys>carstrains-and-planes>cars-and-vehicles

That's fine, and as expected. However, I'd like the user to be able to search by category_id. Currently, our UI does some auto-complete magic on category names and adds a filter to the dynamically-generated SQL with the category_id in it. Had I left the category_id in the GROUP_CONCAT query, it'd be 1160. Let's say they wanted to search for the second one (1308), so we modify the query like so:

SELECT
    product_id,
    name,
    section,
    GROUP_CONCAT(DISTINCT category ORDER BY category) AS categories
FROM view_products_with_categories
WHERE category_id = 1308
GROUP BY product_id;

Now we get the following:

product_id  name                section         categories
332913      Model Train Engine  child-and-baby  toys>baby-and-preschool>playsets

Again, exactly what you'd expect. However, the customer wants to see all categories associated with the product that has one or more of the categories they're searching for. So, let's make some simplified sample data to show you what I'm looking for:

product_id  name       section     category_id  category
1           product_1  section_1   1            category_1
1           product_1  section_1   2            category_2 
1           product_1  section_1   3            category_3
2           product_2  section_2   3            category_3
2           product_2  section_2   4            category_4 
2           product_2  section_2   5            category_5

If the user searches for category_id = 3, I'd like them to get the following:

product_id  name      section   categories
1           product_1 section_1 category_1, category_2, category_3 
2           product_2 section_2 category_3, category_4, category_5 

But I'm currently only getting:

product_id  name      section   categories
1           product_1 section_1 category_3 
2           product_2 section_2 category_3

I just can't figure out a way to do it without a subquery, whose slowness is the reason I'm moving to views in the first place. I'm hoping there's something blindingly obvious I'm missing, probably due to sleep deprivation, so any help would be appreciated.

UPDATE: I also should mention that it's possible that a product is not in any categories, hence the LEFT JOINs in my code.

Upvotes: 3

Views: 2128

Answers (1)

dan b
dan b

Reputation: 1172

The following query works: (although I am not using the view)

select pc1.product_id, products.name, products.section,  
  group_concat(categories.name)
from products
inner join product_categories pc1 
on (pc1.product_id =    products.product_id and pc1.category_id =  3)
inner join product_categories pc2 
on (pc2.product_id = products.product_id)
inner join categories  
on (categories.category_id = pc2.category_id)
group by pc1.product_id, products.name, products.section

If you want to use the view the following will work:

SELECT  vpc.product_id,vpc.name,vpc.section,
GROUP_CONCAT(DISTINCT category ORDER BY category) AS categories
FROM view_products_with_categories vpc
inner join product_categories pc 
on (pc.product_id = vpc.product_id and pc.category_id=3) 
GROUP BY vpc.product_id, vpc.name, vpc.section;

Upvotes: 1

Related Questions