Reputation: 1251
My end goal...SELECT
all fields from qry1
on rows where either the home phone, cell phone, OR work phone match tbl2
This is my current SQL syntax that's "failing." By failing I mean it hasn't finished executing after 10-15 minutes while the individual joins each (independently) run in a minute or two a piece.
SELECT qry1.*
FROM qry1 INNER JOIN tbl2
ON ((qry1.CellPhone = tbl2.CellPhone)
OR (qry1.HomePhone = tbl2.HomePhone)
OR (qry1.WorkPhone = tbl2.WorkPhone));
Questions: Are there errors in my SQL syntax? Is there a better way to complete my task? Would it make more sense for me to simply run the 3 (Home, work, cell) JOIN queries independently, UNION them, and then dedupe if necessary?
Upvotes: 3
Views: 3789
Reputation: 38367
If this is essentially a self join, where the same people exist in both tbl1 and qry1, then you will probably not be interested where people match themselves, as well as duplicate permutations of the results. What I mean, is self joins usually have two records where Bob and Jane have the same phone number, and thus your result has joined records of:
Bob, Jane
Jane, Bob
Jane, Jane
Bob, Bob
Out of those you probably only need Bob, Jane
since it tells you that those two people have a matching phone number. If you have some unique ID then adding where qry1.SSN < tbl2.SNN
will eliminate many of these duplicates and reduce your result set. Preferably this column would have an index.
Upvotes: 1
Reputation: 1182
The following is similar to https://stackoverflow.com/a/27696729/4350148 but uses disjoint union. May be more efficient
SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON qry1.CellPhone = tbl2.CellPhone
UNION distinct
SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON
qry1.CellPhone != tbl2.CellPhone
and qry1.HomePhone = tbl2.HomePhone
UNION distinct
SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON
qry1.CellPhone != tbl2.CellPhone
and qry1.HomePhone != tbl2.HomePhone
and qry1.WorkPhone = tbl2.WorkPhone
Upvotes: 0
Reputation: 753805
If the individual joins work fast, the composite OR condition is probably slow because it cannot use a single index whereas the individual conditions can each use a single index for the three joining conditions. Because it can't use one index, it is probably doing non-indexed sequential table scans. (You should study the query plan so you understand what the optimizer is actually doing.)
Given that the individual queries work reasonably fast, therefore, you should get dramatically better performance using UNION (unless the optimizer in your DBMS has a blind spot):
SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON qry1.CellPhone = tbl2.CellPhone
UNION
SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON qry1.HomePhone = tbl2.HomePhone
UNION
SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON qry1.WorkPhone = tbl2.WorkPhone
That should give you a result approximately as quickly as the 3 individual queries. It won't be quite as fast because UNION does duplicate elimination (which the individual queries do not, of course). You could use UNION ALL, but if there are many rows in the two tables where 2 or 3 of the pairs of fields match, that could lead to a lot of repetition in the results.
Upvotes: 6
Reputation: 1269803
or
can be rather difficult for SQL optimizers. I would suggest 3 indexes and the folllowing query:
SELECT qry1.*
FROM qry1
WHERE EXISTS (SELECT 1 FROM tbl2 WHERE qry1.CellPhone = tbl2.CellPhone) OR
EXISTS (SELECT 1 FROM tbl2 WHERE qry1.HomePhone = tbl2.HomePhone) OR
EXISTS (SELECT 1 FROM tbl2 WHERE qry1.WorkPhone = tbl2.WorkPhone);
The three indexes are tbl2(CellPhone)
, tbl2(HomePhone)
, and tbl2(WorkPhone)
.
Upvotes: 2