Reputation: 4650
Anyone can explain behind this logic in MySQL IN clause and help me understand this issue
I have a user table and this table users are belongs to one or many groups. The group table primary key reference is updated in users table by comma(,) separated values as follows
Query 1. SELECT * FROM user;
+---------+-----------+-------------------------+-----------+
| user_id | user_name | user_email | group_id |
+---------+-----------+-------------------------+-----------+
| 1 | suresh | [email protected] | 22 |
| 2 | sundar | [email protected] | 2 |
| 3 | tester | [email protected] | 1,2,3,4 |
| 4 | gail | [email protected] | 1,2,3,4,5 |
+---------+-----------+-------------------------+-----------+
If I use IN clause and group id value as 2 in MySQL I got only one result
Query 2. SELECT * FROM user WHERE group_id IN(2)
+---------+-----------+---------------------+----------+
| user_id | user_name | user_email | group_id |
+---------+-----------+---------------------+----------+
| 2 | sundar | [email protected] | 2 |
+---------+-----------+---------------------+----------+
If I use IN clause and group id value as (1,2) in MySQL I got three results
Query 3. SELECT * FROM user WHERE group_id IN(1,2)
+---------+-----------+-------------------------+-----------+
| user_id | user_name | user_email | group_id |
+---------+-----------+-------------------------+-----------+
| 2 | sundar | [email protected] | 2 |
| 3 | tester | [email protected] | 1,2,3,4 |
| 4 | gail | [email protected] | 1,2,3,4,5 |
+---------+-----------+-------------------------+-----------+
I want to get group id 2 users like following output but it is not working as expected
If I use this query I need to get query 3 results is it possible?
SELECT * FROM user WHERE group_id IN(2)
Upvotes: 1
Views: 173
Reputation: 44363
What is the logic of the query SELECT * FROM user WHERE group_id IN(1,2);
?
(1,2)
groud_id
was being compare numericallyWhat I am about to present to you may seem rather unorthodox but please follow me...
Here is the query that will get every row that has both 1 and 2 in group_ids:
SELECT user.* FROM
(SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
FROM user) U WHERE LOCATE(',2,',group_ids)) U1
INNER JOIN
(SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
FROM user) U WHERE LOCATE(',4,',group_ids)) U2
ON U1.id = U2.id
INNER JOIN user ON user.id = U2.id;
Here is the code create our sample data
DROP DATABASE IF EXISTS sundar;
CREATE DATABASE sundar;
use sundar
CREATE TABLE user
(
id int not null auto_increment,
user_name VARCHAR(30),
user_email VARCHAR(70),
group_id VARCHAR(128),
PRIMARY KEY (id)
);
INSERT INTO user (user_name,user_email,group_id) VALUES
('suresh' , '[email protected]' ,'22'),
('sundar' , '[email protected]' ,'2'),
('tester' , '[email protected]' ,'1,2,3,4'),
('gail' , '[email protected]' ,'1,2,3,4,5');
SELECT * FROM user;
Let's create your sample
mysql> DROP DATABASE IF EXISTS sundar;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE DATABASE sundar;
Query OK, 1 row affected (0.00 sec)
mysql> use sundar
Database changed
mysql> CREATE TABLE user
-> (
-> id int not null auto_increment,
-> user_name VARCHAR(30),
-> user_email VARCHAR(70),
-> group_id VARCHAR(128),
-> PRIMARY KEY (id)
-> );
Query OK, 0 rows affected (0.04 sec)
mysql> INSERT INTO user (user_name,user_email,group_id) VALUES
-> ('suresh' , '[email protected]' ,'22'),
-> ('sundar' , '[email protected]' ,'2'),
-> ('tester' , '[email protected]' ,'1,2,3,4'),
-> ('gail' , '[email protected]' ,'1,2,3,4,5');
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql>
and here is what it looks like
mysql> SELECT * FROM user;
+----+-----------+-------------------------+-----------+
| id | user_name | user_email | group_id |
+----+-----------+-------------------------+-----------+
| 1 | suresh | [email protected] | 22 |
| 2 | sundar | [email protected] | 2 |
| 3 | tester | [email protected] | 1,2,3,4 |
| 4 | gail | [email protected] | 1,2,3,4,5 |
+----+-----------+-------------------------+-----------+
4 rows in set (0.00 sec)
mysql>
Again, here is the messy query that will get what you want:
SELECT user.* FROM
(SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
FROM user) U WHERE LOCATE(',1,',group_ids)) U1
INNER JOIN
(SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
FROM user) U WHERE LOCATE(',2,',group_ids)) U2
ON U1.id = U2.id
INNER JOIN user ON user.id = U2.id;
Here it is executed:
mysql> SELECT user.* FROM
-> (SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
-> FROM user) U WHERE LOCATE(',1,',group_ids)) U1
-> INNER JOIN
-> (SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
-> FROM user) U WHERE LOCATE(',2,',group_ids)) U2
-> ON U1.id = U2.id
-> INNER JOIN user ON user.id = U2.id;
+----+-----------+-------------------------+-----------+
| id | user_name | user_email | group_id |
+----+-----------+-------------------------+-----------+
| 3 | tester | [email protected] | 1,2,3,4 |
| 4 | gail | [email protected] | 1,2,3,4,5 |
+----+-----------+-------------------------+-----------+
2 rows in set (0.00 sec)
mysql>
OK, how about looking for (2,4)
?
mysql> SELECT user.* FROM
-> (SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
-> FROM user) U WHERE LOCATE(',2,',group_ids)) U1
-> INNER JOIN
-> (SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
-> FROM user) U WHERE LOCATE(',4,',group_ids)) U2
-> ON U1.id = U2.id
-> INNER JOIN user ON user.id = U2.id;
+----+-----------+-------------------------+-----------+
| id | user_name | user_email | group_id |
+----+-----------+-------------------------+-----------+
| 3 | tester | [email protected] | 1,2,3,4 |
| 4 | gail | [email protected] | 1,2,3,4,5 |
+----+-----------+-------------------------+-----------+
2 rows in set (0.00 sec)
mysql>
Looks like it works.
Give it a Try !!!
Upvotes: 1
Reputation: 247840
This is too long to be a comment, but you need to reconsider your current table design. You should not be storing the group_id
values as a comma separated list.
Your tables should be structured similar to the following:
create table user
(
user_id int, PK
user_name varchar(50),
user_email varchar(100)
);
create table groups
(
group_id int, PK
group_name varchar(10)
);
create table user_group
(
user_id int,
group_id int
);
The user_group
table will have a Primary Key of both the user_id
and the group_id
so you cannot get duplicates and then these columns should be foreign keys to the respective tables. This table will allow you to have multiple groups for each user_id.
Then when you query your tables, the query will be:
select u.user_id,
u.user_name,
u.user_email,
g.group_id
from user u
inner join user_group ug
on u.user_id = ug.user_id
inner join groups g
on ug.group_id = g.group_id
See SQL Fiddle with Demo.
If you needed to for display purposes show the group_id
values in a comma separated list, you can use GROUP_CONCAT()
:
select u.user_id,
u.user_name,
u.user_email,
group_concat(g.group_id order by g.group_id) group_id
from user u
inner join user_group ug
on u.user_id = ug.user_id
inner join groups g
on ug.group_id = g.group_id
group by u.user_id, u.user_name, u.user_email
If you redesign your tables, then when you search it becomes much easier:
select u.user_id,
u.user_name,
u.user_email,
g.group_id
from user u
inner join user_group ug
on u.user_id = ug.user_id
inner join groups g
on ug.group_id = g.group_id
where g.group_id in (1, 2)
Upvotes: 3
Reputation: 227310
MySQL doesn't treat comma separated lists as anything more than just a string. When you do WHERE group_id IN(2)
, it converts group_id
to an INT
, so it can compare it with 2
.
When casting to an INT
, MySQL stops at the first non-number character.
For example, '1,2,3,4,5' IN (2)
becomes 1 IN (2)
. Which is FALSE.
You can try to use FIND_IN_SET
to do what you want, but it's not very efficient (because it can't use indexes; it need to read every single row to see if it matches).
WHERE FIND_IN_SET(2, group_id)
To search for multiple rows, use OR
.
WHERE FIND_IN_SET(1, group_id) OR FIND_IN_SET(2, group_id)
The correct way to do this, is to create a "link table" that contains one (or more) rows for each user, showing what group(s) they are in.
Upvotes: 2
Reputation: 34063
When passing 1,2
to the IN
operator, you're asking for 1
and 2
; this is why it will return all three results. If you have a column with comma separated values, you're violating normal form; as each column should not contain more than one value. If you want to find a single value in a multi-valued comma separated column, then you can use FIND_IN_SET
.
A normalized schema would look like:
+---------+-----------+-------------------------+ | user_id | user_name | user_email | +---------+-----------+-------------------------+ | 2 | sundar | [email protected] | | 3 | tester | [email protected] | | 4 | gail | [email protected] | +---------+-----------+-------------------------+ +---------+-----------+ | user_id | group_id | +---------+-----------+ | 2 | 2 | | 3 | 1 | | 3 | 2 | | 3 | 3 | | 3 | 4 | | 4 | 1 | | 4 | 2 | | 4 | 3 | | 4 | 4 | | 4 | 5 | +---------+-----------+ +----------+ | group_id | +----------+ | 1 | | 2 | | 3 | | 4 | +----------+
Upvotes: 2