RocketScienceGuy
RocketScienceGuy

Reputation: 161

Does a complex sql query ever become complex enough that it would be more machine-efficient to do multiple queries?

I have this sense that - inefficiently written queries aside - getting the information you want out of a database is always faster the less queries you make to do so. I don't know where I got that idea from and it gets challenged the more complicated the queries are that I produce (am I really doing MySQL any favors with all these joins?). I'm not asking for an opinion on ease for the programmer or best coding practices, but do conditions exist under which a program would perform faster with a query broken out into multiple steps? If so, how might one make an educated guess a query might reach such an upper limit before going through the effort of coding and comparing?

Upvotes: 0

Views: 48

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269753

Yes, although it is less likely with MySQL. The reason is that MySQL doesn't have a really sophisticated cost-based optimizer. The advantage to intermediate tables is that the sizes are known. A cost-based optimizer can take advantage of this information to improve the query plan.

One place where this can help is when a subquery is repeated multiple times in a query. An intermediate table ensures that it is processed only once (although CTEs would normally do the same thing).

Another place where this can really help is when you add indexes to the intermediate tables. Adding the indexes -- and using them -- can be a big cost savings, more than making up for the cost of creating the index.

That said, I generally discourage using intermediate tables for this purpose, unless the results are needed for multiple queries. I find that just the overhead in debugging makes it not worth it -- for some reason, I don't always delete the intermediate tables and then waste time wondering why some modification doesn't work.

More importantly, as the data changes, modifying the queries can be a pain. I find that changing a column name, for instance, is simpler in a single query than when the logic is spread across multiple queries.

Upvotes: 1

Related Questions