StAlphonzo

Reputation: 766

Vastly different query run time in application

I'm having a scaling issue with an application that uses a PostgreSQL 9 backend. I have one table whose size is about 40 million records and growing, and conditional queries against it have slowed down dramatically.

To help figure out what's going wrong, I've taken a development snapshot of the database and dumped the queries with their execution times into the log.
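(For reference, a common way to get statements together with their run times into the PostgreSQL log is the log_min_duration_statement setting; a minimal sketch with a made-up database name and threshold, not necessarily how the logging was set up here:)

    -- Requires superuser; takes effect for new sessions in that database.
    -- The same setting can also go into postgresql.conf for the whole cluster.
    ALTER DATABASE mydb SET log_min_duration_statement = 500;  -- log statements slower than 500 ms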

Now for the confusing part, and the gist of the question ....

The run times for my queries in the log are vastly different (an order of magnitude or more) than what I get when I run the 'exact' same query in DbVisualizer to get the explain plan.

I say 'exact', but really the difference is that the application uses a prepared statement to which I bind values at runtime, while the queries I run in DbVisualizer have those values in place already. The values themselves are exactly as I pulled them from the log.

Could the use of prepared statements make that big of a difference?

Upvotes: 2

Views: 1956

Answers (2)

Erwin Brandstetter

Reputation: 657982

The answer is YES. Prepared statements cut both ways.

On the one hand, the query does not have to be re-planned for every execution, saving some overhead. This can make a difference or be hardly noticeable, depending on the complexity of the query.

On the other hand, with uneven data distribution, a one-size-fits-all query plan may be a bad choice. Called with particular values, another query plan could be (much) better suited.

Running the query with parameter values in place can lead to a different query plan. More planning overhead, possibly a (much) better query plan.
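You can see the difference yourself by comparing both plans; a minimal sketch (table, column and value are made up; on 9.0/9.1 the prepared statement gets a generic plan, while later versions may switch between generic and custom plans):

    -- Prepared statement: planned without knowledge of the actual parameter value
    PREPARE q(int) AS
    SELECT * FROM big_table WHERE customer_id = $1;

    EXPLAIN ANALYZE EXECUTE q(42);   -- generic plan

    -- Same query with the value inlined: the planner can use column statistics for 42
    EXPLAIN ANALYZE
    SELECT * FROM big_table WHERE customer_id = 42;

    DEALLOCATE q;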

Also consider unnamed prepared statements like @peufeu provided. Those re-plan the query considering parameters every time - and you still have safe parameter handling.

Similar considerations apply to queries inside PL/pgSQL functions, where queries can be treated as prepared statements internally - unless executed dynamically with EXECUTE. I quote the manual on Executing Dynamic Commands:

The important difference is that EXECUTE will re-plan the command on each execution, generating a plan that is specific to the current parameter values; whereas PL/pgSQL may otherwise create a generic plan and cache it for re-use. In situations where the best plan depends strongly on the parameter values, it can be helpful to use EXECUTE to positively ensure that a generic plan is not selected.
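As an illustration, a minimal PL/pgSQL sketch (function, table and column names are made up): the first function uses static SQL, whose plan PL/pgSQL may cache; the second is re-planned on every call because the query goes through EXECUTE:

    CREATE OR REPLACE FUNCTION get_orders_static(_customer_id int)
      RETURNS SETOF orders AS
    $$
    BEGIN
       RETURN QUERY                        -- static SQL: plan can be cached
       SELECT * FROM orders WHERE customer_id = _customer_id;
    END
    $$ LANGUAGE plpgsql;

    CREATE OR REPLACE FUNCTION get_orders_dynamic(_customer_id int)
      RETURNS SETOF orders AS
    $$
    BEGIN
       RETURN QUERY EXECUTE                -- dynamic SQL: re-planned per call
       'SELECT * FROM orders WHERE customer_id = $1'
       USING _customer_id;
    END
    $$ LANGUAGE plpgsql;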

Apart from that, general guidelines for performance optimization apply.

Upvotes: 3

bobflux

Reputation: 11581

Erwin nails it, but let me add that the extended query protocol allows you to use more flavors of prepared statements. Besides avoiding re-parsing and re-planning, one big advantage of prepared statements is that parameter values are sent separately, which avoids escaping and parsing overhead, not to mention the opportunity for SQL injection and bugs if you don't use an API that handles parameters in such a way that you can't forget to escape them.

http://www.postgresql.org/docs/9.1/static/protocol-flow.html

Query planning for named prepared-statement objects occurs when the Parse message is processed. If a query will be repeatedly executed with different parameters, it might be beneficial to send a single Parse message containing a parameterized query, followed by multiple Bind and Execute messages. This will avoid replanning the query on each execution.

The unnamed prepared statement is likewise planned during Parse processing if the Parse message defines no parameters. But if there are parameters, query planning occurs every time Bind parameters are supplied. This allows the planner to make use of the actual values of the parameters provided by each Bind message, rather than use generic estimates.

So, if your DB interface supports it, you can use unnamed prepared statements. It's a bit of a middle ground between a query and a usual prepared statement.

If you use PHP with PDO, please note that PDO's prepared statement implementation is rather useless for Postgres: it uses named prepared statements, but re-prepares every time you call prepare(), so no plan caching takes place. You get the worst of both worlds: many round trips and a plan made without parameter values. I've seen it be 1000x slower than pg_query() and pg_query_params() on specific queries where the Postgres optimizer really needs to know the parameters to produce the optimal plan. pg_query uses raw queries, while pg_query_params uses unnamed prepared statements. Usually one is faster than the other, depending on the size of the parameter data.

Upvotes: 3
