rekam
rekam

Reputation: 1121

SQL "where IN" query in a many to many relation of 2 tables

I maybe ask a relatively simple question. But I cannot find a solution to this. It's a matter of two tables MANY TO MANY, so there's a third table between them. The schema below:

CREATE TABLE `options` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(200) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8;

INSERT INTO `options` (`id`, `name`) VALUES
(1, 'something'),
(2, 'thing'),
(3, 'some option'),
(4, 'other thing'),
(5, 'vacuity'),
(6, 'etc');

CREATE TABLE `person` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(200) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8;

INSERT INTO `person` (`id`, `name`) VALUES
(1, 'ROBERT'),
(2, 'BOB'),
(3, 'FRANK'),
(4, 'JOHN'),
(5, 'PAULINE'),
(6, 'VERENA'),
(7, 'MARCEL'),
(8, 'PAULO'),
(9, 'SCHRODINGER');

CREATE TABLE `person_option_link` (
  `person_id` int(11) NOT NULL,
  `option_id` int(11) NOT NULL,
  UNIQUE KEY `person_id` (`person_id`,`option_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


INSERT INTO `person_option_link` (`person_id`, `option_id`) VALUES
(1, 1),
(2, 1),
(2, 2),
(3, 2),
(3, 3),
(3, 4),
(3, 5),
(4, 1),
(4, 3),
(4, 6),
(5, 3),
(5, 4),
(5, 5),
(6, 1),
(7, 2),
(8, 3),
(9, 4)
(5, 6);

The idea is as follow: I would like to retrieve all people who have a link to option_id=1 AND option_id=3.

The expected result should be one person: John.

But I tried with something like that, which doesn't work because it returns also people who have 1 OR 3:

SELECT * 
FROM person p
LEFT JOIN person_option_link l ON p.id = l.person_id
WHERE l.option_id IN ( 1, 3 ) 

What is the best practice in this case?

//////// POST EDITED: I need to focus on an other important point //////// And what if we add a new condition with NOT IN? like:

SELECT * 
FROM person p
LEFT JOIN person_option_link l ON p.id = l.person_id
WHERE l.option_id IN ( 3, 4 ) 
AND l.option_id NOT IN ( 6 )

In this case, the result should be FRANK, because PAULINE who has also 3 and 4, have the option 6 and we don't want that.

Thanks!

Upvotes: 1

Views: 210

Answers (4)

chabzjo
chabzjo

Reputation: 626

For the "has not these ids" part of the question, simply add a WHERE clause:

WHERE person_id NOT IN 
(
SELECT person_id
FROM person_option_link
WHERE option_id = 4
)

Upvotes: 0

Jonathan Leffler
Jonathan Leffler

Reputation: 754470

It may not be the best option, but you could use a 'double join' to the person_option_link table:

SELECT * 
  FROM person AS p
  JOIN person_option_link AS l1 ON p.id = l1.person_id AND l1.option_id = 1
  JOIN person_option_link AS l2 ON p.id = l2.person_id AND l2.option_id = 3

This ensures that there is simultaneously a row with option ID of 1 and another with option ID of 3 for the given user.

The GROUP BY alternatives certainly work; they might well be quicker too (but you'd need to scrutinize query plans to be sure). The GROUP BY alternatives scale better to handle more values: for example, a list of the users with option IDs 2, 3, 5, 7, 11, 13, 17, 19 is fiddly with this variant but the GROUP BY variants work without structural changes to the query. You can also use the GROUP BY variants to select users with at least 4 of the 8 values which is substantially infeasible using this technique.

Using the GROUP BY does require a slight restatement (or rethinking) of the query, though, to:

  • How can I select people who have 2 of the option IDs in the set {1, 3}?
  • How can I select people who have 8 of the option IDs in the set {2, 3, 5, 7, 11, 13, 17, 19}?
  • How can I select people who have at least 4 of the option IDs in the set {2, 3, 5, 7, 11, 13, 17, 19}?

Upvotes: 0

sgeddes
sgeddes

Reputation: 62851

Use GROUP BY and COUNT:

SELECT p.id, p.name
FROM person p
LEFT JOIN person_option_link l ON p.id = l.person_id
WHERE l.option_id IN ( 1, 3 ) 
GROUP BY p.id, p.name
HAVING COUNT(Distinct l.option_id) = 2

I prefer using COUNT DISTINCT in case you could have the same option id multiple times.

Good luck.

Upvotes: 2

John Woo
John Woo

Reputation: 263803

This is a Relational Division Problem.

SELECT p.id, p.name
FROM   person p
       INNER  JOIN person_option_link l 
          ON p.id = l.person_id
WHERE  l.option_id IN ( 1, 3 ) 
GROUP  BY p.id, p.name
HAVING COUNT(*) = 2

if a unique constraint was not enforce on option_id for every id, a DISTINCT keyword is required to filter unique option_ID

SELECT p.id, p.name
FROM   person p
       INNER  JOIN person_option_link l 
          ON p.id = l.person_id
WHERE  l.option_id IN ( 1, 3 ) 
GROUP  BY p.id, p.name
HAVING COUNT(DISTINCT l.option_id) = 2

Upvotes: 2

Related Questions