Dan
Dan

Reputation: 121

MySQL select from multiple tables with multiple where clauses and find_in_set

I'm having an issue trying to avoid partial duplicate results with a MySQL query. I admit I'm really new at MySQL but I have learned from research on SO that the schema I'm about to lay out for you below could definitely be done a better way (the linked_users column of the users table should be a separate table). However, I cannot change the way it is set up right now.

I'm trying to return the user names of the user ids assigned to each item in t2 or the user names of the users who are linked to those users. The query is returning two sets of names for each item, however. I think this is happening because it is searching them all twice, and I've attempted to read this tutorial on SO regarding returning multiple values from multiple tables but I can't seem to wrap my mind around the JOINS and UNIONS and whatnot.

My question is two-fold:

  1. What can I do to solve this issue without changing the way the database is set up?
  2. How should the database be changed to better allow for queries like this in the future?

Thank you for your time.

Schema:

create table users (user_id int, user_name varchar (55), linked_users varchar (55));
insert into users( user_id, user_name, linked_users)values(1, 'user1', '2,154,4,45');
insert into users( user_id, user_name, linked_users)values(2, 'user2', '13,1,200');
create table t2 (t2_id int, user_id int);
insert into t2( t2_id, user_id)values(1, 2);
insert into t2( t2_id, user_id)values(2, 1);
insert into t2( t2_id, user_id)values(3, 1);
insert into t2( t2_id, user_id)values(4, 2);
insert into t2( t2_id, user_id)values(5, 3);

Query:

SELECT t.*, u.user_name 
FROM t2 t, users u 
WHERE t.user_id = u.user_id
OR find_in_set(t.user_id, u.linked_users) > 0

Fiddle: http://sqlfiddle.com/#!9/c5540/10

Upvotes: 0

Views: 3212

Answers (3)

Dan
Dan

Reputation: 121

Thank you to Gordon and Tarik for your responses. I tried the code you supplied in my fiddle but was unable to pull the t2 information, so after messing around with it some more I found this query solved the issue I was having:

SELECT t2.*, u.user_name 
FROM users u, t2 t 
WHERE t.user_id = u.user_id 
AND (t.user_id = '$user_id' 
     OR t.user_id IN 
        (SELECT u2.user_id 
         FROM users u2 
         WHERE find_in_set('$user_id', u2.linked_users)
        )
    )

Again, I understand that this is not the optimal way to do it, but since I can't change the structure this will serve the purpose.

Upvotes: 1

Tarik
Tarik

Reputation: 11209

You have a many to many relationship within the same table. One user may have many linked users and a user may be the linked user of more that one user. To represent that you need an additional table that you may call linked_users, defined with two columns:

user_id int foreign key to user_id in users table
linked_user_id int foreign key to user_id in users table

The primary key of this table should be both user_id and linked_user_id

Now, let's say you have users 1 to 10 with 1 having 5, 6, 8 as linked users and 3 having 1, 6, 10 as linked users. You should insert

1, 5
1, 6
1, 8
3, 1
3, 6
3, 10

in linked_users. You query becomes:

select u1.*, u2.*
from users u1 inner join linked_users lu on u1.user_id = lu.user_id
inner join users u2 on lu.linked_user_id = u2.user_id

If you have to stick to your design

select u1.*, u2.*
from users u1 inner join users u2 on
find_in_set(u2.user_id, u1.linked_users) > 0

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270653

You should definitely change the structure of your tables. The correct structure would use two tables, one for the entity ("users") and the other for the relationships between them:

create table users (
    user_id int not null primary key auto_increment,
    user_name varchar(55)
);

create table userlinks (
    user_id int not null references users(user_id),
    linked_user_id int not null references users(user_id)
);

create table t2 ( t2_id int, user_id int not null references users(user_id) );

Your structure has several obvious flaws:

  • You are storing integer values as string representations of them.
  • You are storing lists in a string, and SQL has very little support for comma-delimited lists.
  • You have no definitions of primary or foreign keys.

With this structure, you can get the list of users for a given "t" value without duplicates and without a distinct by using exists:

select u.user_id
from users u cross join
     (select user_id from t2 where t2.tid = 1) t
where u.user_id = t.user_id or
      exists (select 1
              from ul
              where ul.user_id = t.user_id and
                    u.user_id = ul.linked_user_id
             );

The query says: "Get all the users where the user id matches the corresponding id in t2 or the user is linked to the user_id in t2."

As a note, you can also do this with your structure by doing:

select u.user_id
from users u cross join
     (select user_id from t2 where t2.tid = 1) t
where u.user_id = t.user_id or
      find_in_set(u.user_id,
                  (select u.linked_users from users u2 where u2.user_id = t.user_id)
                 ) > 0;

Upvotes: 1

Related Questions