Rich Bradshaw
Rich Bradshaw

Reputation: 72965

Selecting other side of many to many relationship

Background

I have three tables: (SQL fiddle: http://sqlfiddle.com/#!2/f7b33/11)

products

+----+-----------+
| id |  product  |
+----+-----------+
|  1 | product_1 |
|  2 | product_2 |
|  3 | product_3 |
+----+-----------+

products_features

+------------+------------+
| product_id | feature_id |
+------------+------------+
|          1 |          1 |
|          1 |          2 |
|          1 |          3 |
|          1 |          4 |
|          2 |          1 |
|          2 |          3 |
|          3 |          4 |
+------------+------------+

features

+----+-----------+
| id |  feature  |
+----+-----------+
|  1 | feature_1 |
|  2 | feature_2 |
|  3 | feature_3 |
|  4 | feature_4 |
+----+-----------+

I'm then selecting like this:

SELECT products.product,
       GROUP_CONCAT(features.feature) AS features
FROM   products
       LEFT JOIN products_features
              ON product_id = products.id
       LEFT JOIN features
              ON products_features.feature_id = features.id
GROUP  BY products.id

to get something like:

+-----------+-----------------------------------------+
|  product  |                features                 |
+-----------+-----------------------------------------+
| product_1 | feature_1,feature_2,feature_3,feature_4 |
| product_2 | feature_1,feature_3                     |
| product_3 | feature_4                               |
+-----------+-----------------------------------------+

Question

So, everything is great, However, what I'd like to do is to only have things that have feature_1 and feature_3, whilst still getting the other features.

In other words, I'd like to write a query that would get me:

+-----------+-----------------------------------------+
|  product  |                features                 |
+-----------+-----------------------------------------+
| product_1 | feature_1,feature_2,feature_3,feature_4 |
| product_2 | feature_1,feature_3                     |
+-----------+-----------------------------------------+

I've tried:

SELECT products.product,
       GROUP_CONCAT(features.feature) AS features
FROM   products
       LEFT JOIN products_features
              ON product_id = products.id
       RIGHT JOIN features
              ON products_features.feature_id = features.id AND features.feature in ('feature_1','feature_3')

GROUP  BY products.id

but of course I get:

+-----------+---------------------+
|  product  |      features       |
+-----------+---------------------+
| (null)    | feature_4,feature_2 |
| product_1 | feature_1,feature_3 |
| product_2 | feature_3,feature_1 |
+-----------+---------------------+

So though I now know product_1 and product_2 are the ones with those features, I can't see the rest of the features they have.

What query will give me allow me to specify feature_1 and feature_3 and get the following response?

+-----------+-----------------------------------------+
|  product  |                features                 |
+-----------+-----------------------------------------+
| product_1 | feature_1,feature_2,feature_3,feature_4 |
| product_2 | feature_1,feature_3                     |
+-----------+-----------------------------------------+

Upvotes: 3

Views: 34

Answers (2)

Mosty Mostacho
Mosty Mostacho

Reputation: 43434

Returning data from a group_concat and then splitting the results back in your front-end is a no go. Not only will it result in inefficient use of resources but also might result in errors in the splitting (eg: imagine what would happen when a feature appears and it happens to contain a ,).

The third and main reason not to use a group_concat is that it has a limit on the length. I'm not re-explaining the wheel but check this question for more information.

The best approach will be to return all the matching features for a given product and then just process them in a loop. It should be pretty simple to do (actually, most UI components, web or not, would expect to receive a collection to display them and this is what you're sending to them).

Additionally, it is most likely that you already have the ids of the features that you want to check so it would be more efficient to check for them instead of strings.

My last comment is that you don't actually need left joins in there. A left join will return all elements from the left regardless of whether they have a match in the right. However, you need the right side to have 2 elements (the 2 features) which makes the query contradictory. You just need an inner join in there.

This is the query I would use:

SELECT p.product, f.feature FROM products p
JOIN (
  SELECT product_id FROM products_features
  WHERE feature_id IN (1, 3)
  GROUP BY product_id
  HAVING count(*) = 2
) pf ON p.id = pf.product_id
JOIN products_features pf2 ON p.id = pf2.product_id
JOIN features f ON pf2.feature_id = f.id

This is the fiddle for that query.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269843

I think the most generalizable way to approach this is with a having clause:

SELECT products.product,
       GROUP_CONCAT(features.feature) AS features
FROM   products
       LEFT JOIN products_features
              ON product_id = products.id
       LEFT JOIN features
              ON products_features.feature_id = features.id
GROUP  BY products.id
HAVING sum(features.feature = 'feature_1') > 0 and
       sum(features.feature = 'feature_3') > 0;

Each clause in the having statement is counting the number of times that a given feature appears. The and is requiring that both features be in the final result set.

EDIT:

Given the structure of your statement, you could also do:

HAVING find_in_set('feature_1', features) > 0 and
       find_in_set('feature_3', features) > 0;

This works because you are producing a column with the list of features and you are using a comma as a separator for that list.

Upvotes: 3

Related Questions