RTF
RTF

Reputation: 6494

Unusual MySQL behavior with select query

I have 3 tables A, B and C:

I'm running a query like this:

SELECT A.Id FROM A, B, C 
WHERE A.Id = B.SomeId OR (A.Id = C.SomeId AND C.SomeValue = 'X') 
INTO OUTFILE '/tmp/result.txt';

I thought this would only have to iterate over each Id in table A (1000 rows) and then potentially query across the other tables (depending on whether MySQL short circuits, I don't know if it does).

But the query seems to hang, or at least it's taking a very long time. Much longer than I would have expected if it only had to iterate 1000 rows. 10 minutes in and the output file is still empty. Let me know if I can provide any more information.

my@laptop$ mysql --version
mysql  Ver 14.14 Distrib 5.5.37, for debian-linux-gnu (i686) using readline 6.3

Edit:
The result I'm looking for is 'Give me all the Id's in table A where the Id matches B.SomeId OR ELSE the Id matches C.SomeId AND C.SomeValue equals 'X'.

Upvotes: 0

Views: 85

Answers (3)

Jenn
Jenn

Reputation: 795

Let's go even smaller. Let's say that your tables look like this:

A.ID
1
2

B.SomeID
1
3

C.SomeID | C.SomeValue
1        |   X
2        |   X

Now, let's see what your query will do. First, we look to see if A.ID match and B.SomeID match. In the case of A.ID = 1, we have a match! Sql short circuits. This means that if the first part of your or is true, sql doesn't evaluate the 2nd part of your or. Now, we still have to join with table C. Since there is no join condition, for table C sql matches A.ID with all the columns in table C.

Now we need to compare A.ID with the next row in B. Well, 1 <> 3. So, we move on to the second part of the or. When C.SomeID = 1, the row is included. When C.SomeID = 2, the row is not included. Your results for A.ID = 1 are:

A.ID    | B.SomeID   | C.SomeID   | C.SomeValue
1       | 1          | 1          |  X
1       | 1          | 2          |  X
1       | 3          | 1          |  X

This is clearly not the results table that you are looking for. Since you are going to join A with either table B or C, instead of an or, you should use a union

SELECT A.Id FROM A, B
WHERE A.Id = B.SomeId

Union All

Select A.ID From A, C
 Where A.Id = C.SomeId AND C.SomeValue = 'X'

Union all puts the results from the first query into the same results table as the results from the second query. Now, your question says that you only want the A.IDs that are in one table but not the other (or else). There are several ways to do this. In this case, I am going to use a having and a subquery. You could also use a not exists but I believe that having is going to use less resources.

Select T.ID
From
(SELECT A.Id FROM A, B
WHERE A.Id = B.SomeId

Union All

Select A.ID From A, C
Where A.Id = C.SomeId AND C.SomeValue = 'X') T
Group By T.ID
Having count(1) = 1

We only want the Ids that show up exactly one time. This will only work if the id is not repeated in B or C, so keep that in mind. Since the condition is based on the aggregate function, count, this stipulation must be in the having.

Upvotes: 2

Barmar
Barmar

Reputation: 780724

OR expressions often make it difficult for MySQL to use indexes. Try changing to a UNION:

SELECT A.id
FROM A
JOIN B ON A.id = B.SomeID
UNION
SELECT A.id
FROM A
JOIN C ON A.id = C.SomeID
WHERE C.SomeValue = 'A'

From the documentation:

Minimize the OR keywords in your WHERE clauses. If there is no index that helps to locate the values on both sides of the OR, any row could potentially be part of the result set, so all rows must be tested, and that requires a full table scan. If you have one index that helps to optimize one side of an OR query, and a different index that helps to optimize the other side, use a UNION operator to run separate fast queries and merge the results afterward.

Your query is described by the last sentence: you have different indexes for each side of the OR query.

Upvotes: 5

Matt
Matt

Reputation: 1155

I'm not strong in MySQL, but I think this would work better:

SELECT Id FROM
( SELECT A.Id FROM A, B
  WHERE A.Id = B.SomeId
  UNION
  SELECT A.Id FROM A, C 
  WHERE A.Id = C.SomeId AND C.SomeValue = 'X'
) X
INTO OUTFILE '/tmp/result.txt';

Upvotes: 0

Related Questions