Reputation: 7144
I've been looking around to find how to do this but can't find it.
I have a table of purchases
that contains barcode
and user_id
fields.
This table has many rows of barcodes and ids, barcodes are all different, but user_ids can be multiple, like this:
I than have a different table of accountings
with barcodes and NO user_ids.
I have to find the barcodes from purchases
and extract data from accountings
(where the barcode found in purchases
is the same as the UPC OR ISRC [fields] in accountings
)
I've tried like this:
SELECT *
FROM accountings
WHERE UPC IN(SELECT barcode
FROM purchases
WHERE user_id = '740')
But it runs endlessly and doesn't work.
Any way to get this right?
Upvotes: 0
Views: 141
Reputation: 108370
The "foo IN (correlated subquery)"
predicate is going to each your lunch, along with your lunchbox, in terms of query performance, because of the way MySQL processes it. The subquery will get executed for each and every row in the outer query. With a large set from the outer query, that turns into a whole lot of subquery executions.
For improved performance, if you have guaranteed uniqueness of the "(barcode,userid)"
tuple in purchases
table (which doesn't seem likely) a simple join will give you the specified resultset:
SELECT a.*
FROM accountings a
JOIN purchases p
ON p.barcode = a.upc
AND p.user_id = '740'
That query will return "duplicate" rows from accountings
if there are multiple rows in purchases
with the same barcode for user_id = 740.
To fix that, one option is to add "GROUP BY"
clause to collapse the duplicate rows; you just need a column, or set of columns, guaranteed to be unique in the accountings
table. If you have a surrogate primary key named id
for example:
SELECT a.*
FROM accountings a
JOIN purchases p
ON p.barcode = a.upc
AND p.user_id = '740'
GROUP BY a.id
But, in the more general case, an alternate approach is to use an inline view to get a list of distinct barcode values for a given user, and the use a JOIN operation to that set of barcode values:
SELECT a.*
FROM accountings a
JOIN ( SELECT p.barcode
FROM purchases p
WHERE p.user_id = '740'
GROUP BY p.barcode
) q
ON q.barcode = a.upc
The inline view aliased as "q
" gets a distinct list of barcode values for the specified user, and the rows from that query are used like a table (MySQL actually calls it a "derived table") by the outer query.
These aren't the only ways to get the resultset, but they are common approaches.
Appropriate indexes will improve performance with large sets:
... ON accountings (upc)
... ON purchases (user_id, barcode)
FOLLOWUP
The output from an EXPLAIN will show us the execution plan MySQL will use to execute the query. It's possible that the last query in my answer could reverse the order of the tables, to influence the execution plan:
SELECT a.*
FROM ( SELECT p.barcode
FROM purchases p
WHERE p.user_id = '740'
GROUP BY p.barcode
) q
JOIN accountings a
ON a.upc = q.barcode
I thought the MySQL optimizer is smart enough to figure that out without us having to change the order of the tables; best clues to performance improvements are usually found in the output from an EXPLAIN SELECT ...
.
Upvotes: 1
Reputation: 669
You should probably try a JOIN. (mysql JOIN). The following will give you a table matched by barcode:
SELECT * FROM accountings a
LEFT JOIN purchases p ON a.UPC=p.barcode
Now that you have that table, you can build the query up to the full size you want it to be:
SELECT *
FROM accountings a
LEFT JOIN purchases p ON a.UPC=p.barcode
WHERE p.user_id='740'
I hope this also makes the concept of a LEFT JOIN clear. It'll add tables together and browse through combined data. It'll get all data though, from both accountings and purchases.
But I think Nath got it better by saying you probably have an int.
Upvotes: 0