Reputation: 63
What is the optimal solution, use Inner Join or multiple queries?
something like this:
SELECT * FROM brands
INNER JOIN cars
ON brands.id = cars.brand_id
or like this:
SELECT * FROM brands
... (while query)...
SELECT * FROM cars WHERE brand_id = [row(brands.id)]
Upvotes: 0
Views: 158
Reputation: 425033
One query is better, because up to about 90% of the expense of executing a query is in the overheads:
Do all that just once for one query, or do it all n times for n queries, but get the same data.
Upvotes: 1
Reputation: 37365
In general, use first query. Why? Because query execution time is not just query itself time, but also some overheads, such as:
Depending of situation, some overheads may present or not. For example, if you're using persistent connection, then you'll not get connection overhead. But in common case that's not true, thus, it will have place. And creating/maintaining/closing connection overhead is very significant part. Imagine that you have this overhead as only 1% from total query time (in real situation it will be much more). And you have - let's say, 1.000.000 rows. Then first query will produce that overhead only once, while second will be 1.000.000/100 = 10.000
times. Just think about - how slow it will be.
Besides, INNER JOIN
will also be done using key - if it exists, thus, in terms of query itself speed it will be near same as second. So I highly recommend to use INNER JOIN
option.
Breaking complex query into simple queries may be useful in a very specific cases. For example, case with IN
subquery. In this situation, if you're using WHERE id IN (subquery)
, where (subquery)
is some SQL, MySQL will treat that as = ANY
subquery and will not use key for that, even if subquery results in narrow list of ids. And - yes, split it into two queries may have sense since WHERE IN(static list)
will work in another way - MySQL will use range index scan for that (strange, but true - because for IN (static list)
statement IN
will be treated as comparison operator, and not =ANY
subquery qualifier). This part isn't directly about your case - but to show that - yes, cases, when splitting processing from DBMS may be useful in terms of performance - exist.
Upvotes: 1
Reputation: 1269773
Your two queries are not exactly the same.
The first returns all fields from brands
and cars
in one row. The second returns two different result sets that need to be combined together.
In general, it is better to do as many operations in the database as possible. The database is more efficient for processing large amounts of data. And, it generally reduces the amount of data being brought back to the client.
That said, there are a few circumstances where more data is being returned in a single query than with multiple queries. For instance in your example, if you have one brand record with 100 columns and 10,000 car records with three columns, then the two-query method is probably faster. You are only bringing back the columns from brands
table once rather than 10,000 times.
These examples where multiple queries is better are few and far between. In general, it is better to do the processing in the database. If performance needs to be improved, then in a few rare cases, you might be able to break up queries and improve performance.
Upvotes: 2
Reputation: 1489
Generally speaking, one query is better, but there are come caveats to that. For example, older versions of SQL Server had a great decreases in performance if you did more than seven joins. The answer will really depend on the database engine, version, query, schema, fields, etc., so we can't say for sure which is better. Always look into minimizing the number of queries when possible without going too overboard and creating result sets that are crazy or impossible to maintain.
Upvotes: 3
Reputation: 10191
This is a very subjective question but remember that each time you call the database there's significant overhead.
Almost without exception the optimum is to issue as few commands and pull out all the data you'll need. However for practical reasons this clearly may not be possible.
Generally speaking if a database is well maintained one query is quicker than two. If it's not you need to look at your data/indicies and determine why.
A final point, you're hinting in your second example that you'd load the brand then issue a command to get all the cars in each brand. This is without a doubt your worst option as it doesn't issue 2 commands - it issues N+1 where N is the number of brands you have... 100 brands is 101 DB hits!
Upvotes: 2