Dibbs
Dibbs

Reputation: 33

Issue with grouping?

I asked earlier about a solution to my problem which worked however now when I'm trying to get some information from a second table (that stores more information) I'm running into a few issues.

My tables are as follows

Users

+----+----------------------+---------------+------------------+
| id | username             | primary_group | secondary_groups |
+----+----------------------+---------------+------------------+
|  1 | Username1            |      3        |       7,10       |
|  2 | Username2            |      7        |      3,5,10      |
|  3 | LongUsername         |      1        |       3,7        |
|  4 | Username3            |      1        |       3,10       |
|  5 | Username4            |      7        |                  |
|  6 | Username5            |      5        |      3,7,10      |
|  7 | Username6            |      2        |        7         |
|  8 | Username7            |      4        |                  |
+----+----------------------+---------------+------------------+

Profile

+----+---------------+------------------+
| id |   facebook    |      steam       |
+----+---------------+------------------+
|  1 |  10049424151  |       11         |
|  2 |  10051277183  |       55         |
|  3 |  10051281183  |       751        |
|  4 |               |       735        |
|  5 |  10051215770  |      4444        |
|  6 |  10020210531  |      50415       |
|  7 |  10021056938  |     421501       |
|  8 |  10011547143  |       761        |
+----+---------------+------------------+

My SQL is as follows (based off the previous thread)

SELECT u.id, u.username, p.id, p.facebook, p.steam 
FROM users u, profile p 
WHERE p.id=u.id AND FIND_IN_SET(  '7', secondary_groups ) 
OR primary_group = 7 
GROUP BY u.id

The problem is my output is displayed as below

+----+----------------------+-------------+-------+
| id | username             |   facebook  | steam |
+----+----------------------+-------------+-------+
|  1 | Username1            | 10049424151 |  11   |
|  2 | Username2            | 10051277183 |  55   |
|  3 | LongUsername         | 10051281183 |  751  |
|  4 | Username4            | 10051215770 | 4444  |
|  5 | Username5            | 10049424151 |  11   |
|  6 | Username6            | 10049424151 |  55   |
+----+----------------------+-------------+-------+

Upvotes: 1

Views: 46

Answers (3)

alttag
alttag

Reputation: 1173

I think @SIDU has it in the comments: You are experiencing a Boolean order of operations problem. See also SQL Logic Operator Precedence: And and Or

For example:

SELECT 0 AND 0 OR 1 AS test;

+------+
| test |
+------+
|    1 |
+------+

When doing complex statements with both AND and OR, use parenthesis. The operator order problem is leading to you doing an unintended outer join that's being masked by your GROUP BY. You shouldn't need a GROUP BY for that statement.

Although I don't personally care for the style @spencer7593 suggests in his answer(using INNER JOIN, etc.), it does have the advantage of preventing or identifying errors early for people new to SQL, so it's something to consider.

Upvotes: 0

spencer7593
spencer7593

Reputation: 108420

I'm guessing that the problem is that profile rows with a primary_group of 7 are getting matched to all user rows. Remove the GROUP BY, and you'll be able to better see what is happening.

But that's just a guess. It's not clear what you are attempting to achieve.

I suspect you are getting tripped up with the order of precedence of the AND and OR. (The AND operator has a higher order of precedence than OR operator. That means the AND will be evaluated before the OR.)

The quick fix is to just add some parens, to override the default order of operations. Something like this:

WHERE p.id=u.id AND ( FIND_IN_SET('7',secondary_groups) OR primary_group = 7 )
   --               ^                                                        ^

The parens will cause the OR operation to be evaluated (as either TRUE, FALSE or NULL) and then the result from that will be evaluated in the AND.

Without the parens, it's the same as if the parens were here:

WHERE ( p.id=u.id AND FIND_IN_SET('7',secondary_groups) ) OR primary_group = 7 
   -- ^                                                 ^

With the AND condition evaluated first, and the result from that is operated on by OR. This is what is causing profile rows with a 7 to be matched to rows in user with different id values.


A few pointers on style:

  • avoid the old-school comma operator for join operations, and use the newer JOIN syntax

  • place the join predicates (conditions) in the ON clause, other filtering criteria in the WHERE clause

  • qualify all column references

As an example:

  SELECT u.id
       , u.username
       , p.id
       , p.facebook
       , p.steam
    FROM users u
    JOIN profile p
      ON p.id = u.id
   WHERE u.primary_group = 7
      OR FIND_IN_SET('7',u.secondary_groups) 
   ORDER BY u.id

We only need a GROUP BY clause if we want to "collapse" rows. If the id column is unique in both the users and profile tables, then there's no need for a GROUP BY u.id. We can add an ORDER BY clause if we want rows returned in a particular sequence.

Upvotes: 1

I don't know, what exactly do you want to do with output, but you can't group informations like this. MySQL isn't really a classic programming language, it's more like powerful tool for set mathematics. So if you want to get informations based on corelations between two or more tables, first you write a select statement which contains raw data which you want to work with, like this:

    SELECT * FROM users u INNER JOIN profile p ON p.id=u.id
    GROUP BY u.id;

Now you select relevant data with WHERE statement:

    SELECT * FROM users u INNER JOIN profile p ON p.id=u.id WHERE
    FIND_IN_SET(  '7', secondary_groups ) OR primary_group = 7
    GROUP BY u.id;

Now you should see grouped joined tables profile and users, and can start mining data. For example, if you want to count items in these groups, just add count function in SELECT and so on. When debugging SQL, I highly recommend these steps: 1.) First, you should write down all corelations between data, all foreign keys between tables, so you will know if your selection is fully deterministic. You can now start JOINing tables from left to right 2.) Try small bits of querys on model database. Then you will see which selection works right and which doesn't do what you expected.

Upvotes: 0

Related Questions