robert_difalco
robert_difalco

Reputation: 4914

PostgreSQL and JDBC Prepared Statement Caching

I've been reading that some of the newer JDBC Connection Pools (like Tomcat) do not support client side statement pooling. I already read that this is because most JDBC drivers maintain their own statement cache. However, I do not see this happening for PostgreSQL.

Am I correct on this? If so should I be using a connection pool that CAN cache prepared statements in order to get the best batched insert performance?

Thanks

Upvotes: 2

Views: 2998

Answers (1)

vanOekel
vanOekel

Reputation: 6548

Tomcat's JDBC connection pool provides a StatementCache. I don't know how to use it (with JPA) but it promises to "cache PreparedStatement and/or CallableStatement instances on a connection."

But for insert batches you probably don't need to re-use a PreparedStatement: you can use the addBatch method as shown in this example. Odd thing is though, that the official documentation says that "this method cannot be called on a PreparedStatement or CallableStatement". I guess you'll have to try it out to find out if that is true and as Craig Ringer comments: "PgJDBC just sends them individually anyway".

As for performance: that breaks down in client, server and driver/network traffic.
Database servers generally learn what queries to expect from connections (which is also why re-using connections via a connection pool is a good idea). I believe the PostgreSQL database server will start to remember queries from a connection once the same query occurs more than 5 times.
If the driver properly handles PreparedStatements that are re-used by the client, the driver can decide to only send the new data and not the whole query. This can have a significant positive performance effect for insert statements (and I believe that, among others, the JDBC driver for SQL Server does this).
If the client caches PreparedStatements (e.g. statement is only closed when the connection that created it is closed) it will help in not executing the same code (also from the driver) again and again. I.e. less initialization time and less garbage collection.

As example for an alternative to Tomcat's JDBC connection pool: I have used Yapool's SimpleQueryCache (example) with a MySQL database (and JDBC driver) in a little server component that fired about 15 different queries. During stress/load testing I was surprised by the relative low number for the maximum amount of connections required in the pool that did not impact the speed of the little server component (i.e. connections were borrowed from the pool for a relative short amount of time). So, at least in some situations, client side caching of statements can make a difference.

On a side node: if you use something like Hibernate a lot of optimalizations are done for you by Hibernate (under water) and there is a good chance that (statement) caching is already done for you.

Upvotes: 0

Related Questions