pjj
pjj

Reputation: 2135

Internal functioning of "prepareStatement(String sql)"

I have read the description of prepareStatement(String sql) and also lot of posts related to performance and caching of PreparedStatement and I am clear that database will parse and compile the query of the prepared statement so that for subsequent same queries, another round of parse and compile will not happen but I am not clear on whether:

Upvotes: 0

Views: 436

Answers (3)

Sanders the Softwarer
Sanders the Softwarer

Reputation: 2496

  1. Yes, every call of prepareStatement should result a database call. If no, it may be, say, this scenario:

    execute("create table x(y integer)");
    prepareStatement("select * from x"); // #1
    execute("rename table x to old_x");
    execute("create table x(z varchar(100))");
    prepareStatement("select * from x"); // #2 - this stamenent is not equal to previous
    
  2. PreparedStatement is generally a wrapper of cursor handle. prepareStatement sends SQL statement to RDBMS. RDBMS compiles it and returns a handle for it. Next database calls uses this handle so RDBMS will use compiled statement.

  3. It depends on RDBMS. Say, Oracle will use 'soft parse' in this case. Soft parse means than database locates equal statement in it's cache and uses it if possible. It is more effective than recompilation but less effective than using prepared statement.

Upvotes: 1

pjj
pjj

Reputation: 2135

Although @Sanders and @Douglas are good but they are not complete (even considering only JDBC driver), so I am also putting my answer, which is also not complete, I would recommend reading all 3 answers to get good knowledge:

Performance benefit of PreparedStatement

  • PreparedStatement objects are best suited for dynamic queries and also good from avoiding SQL injection perspective.
  • Caching with PreparedStatement:
    1. Caching at database level:
      • In the caching at database level, when you use a parameterized query with PreparedStatement object then for first execution db server will parse and compile the query and cache its execution plan, now when same parameterized query comes up again then all this things need not to be done again so you get performance benefit (note here that why you should prefer paramterized query and not “+” operator based query).
      • So, bottom line is that db server can cache the query so that it can avoid its parsing, compiling and identifying the execution plan time.
    2. Caching at J2EE server level:
      • Now, before I begin important thing to note is that this caching is only in case of J2EE servers, if you are having a standalone Java program then you cannot get this caching.
      • Now, in case of JEE servers, you get a pooled connection object, now when you create a prepared statement object from it then JEE server will cache this prepared statement object for that database connection (now, important thing to note here is that in case of JEE servers when you call the close method on connection object, the real connection with database server will not be closed, it will only close the proxy/wrapper connection object, so I think if you have set some property to the connection object then it will still be there), so when the same connection object is returned to the application and if case same query is used with the prepared statement object then JEE server will skip the round trip to db server and hence you will get performance benefit.
  • PreparedStatements are good from performance perspective because you can compile a query using PreparedStatement and then use this object to pass different parameters. Now, important thing is to contrast this with Statement object usage – you cannot set or pass parameters using Statement object so each time you have to create a new statement object (which means a round trip to db server) and then execute it (execute it means another round trip to db server). Now, if you see the case of PreparedStatement then you can create the PreparedStatement object once by specifiying your query parameter (this means round trip to db server) and then set different parameter on this object and execute it, now advantage you get here is that you will construct the PreparedStatement object only once which means only 1 round trip to db server and hence saving round trips to db server for object creation which was there in case of Statement object.
  • Using parameterized query vs “+” operator based query:
    • Now, it is said that you should use parameterized query with PreparedStatement object and not “+” operator based query, which is true, but important thing to note is that it is not the case that there will be no performance improvement, as soon as you use PreparedStatement object instead of Statement object then you get benefitted by avoid round trips to db server for object creation, so certainly there is benefit. But disadvantage of using “+” operator is at the db server side, if you use “+” operator then db server will not be able to cache the query which will happen when you use the parameterized query.
  • Another important thing to note is that when you call connection. prepareStatement() then if it is a standlone Java program then a round trip to server will happen while if it is a J2EE server then there is possibility that round trip to server will not happen because of JEE PreparedStatement cache

Upvotes: 0

Douglas Surber
Douglas Surber

Reputation: 696

The following applies only to the Oracle Database JDBC drivers. Other drivers may be different.

A call to Connection.prepareStatement(String) does not do a database round trip. It constructs a new PreparedStatement and stores the SQL String in it. That's about it.

Even unused PreparedStatements are moderately complex so this isn't an entirely trivial call. Even so, there isn't much value in caching unused PreparedStatements. The cost of constructing a new one is low. Emphasis on "unused". The first execution of a PreparedStatement does do a full round trip and completes constructing the PreparedStatement. Having used a PreparedStatement once the cost of reusing it is much less than the cost of creating a new one and using it for the first time.

Executing the same SQL in two PreparedStatements is more expensive for both the client and the server than executing a single PreparedStatement twice. For the client, the additional cost is pretty straightforward. It is the cost of fully constructing the second PreparedStatement, this includes the work the client does after the first execute round trip. Things are not so simple in the database.

The Oracle Database has multiple levels of caching and reuse to minimize the cost of executing a given SQL string multiple times. A PreparedStatement holds a cursor id. That id refers to a cursor in the server. A cursor is complex structure that is the database's representation of a SQL execution. Some of the structure can be shared by other cursors executing the same SQL. Some of the structure is unique to a single execution. Some structure can be shared among some cursors executing the SQL but not others. It is a complex system.

As a general rule creating a new PreparedStatement requires a hard parse. If the server has seen the SQL before then the hard parse may not be a full hard parse as the server can reuse some of the cursor structures. If the app re-executes a PreparedStatement then ideally the server doesn't have to do anything to the cursor; it just re-executes it. But there are many cases where the server has to do a soft parse before it can re-execute. A soft parse is less work than a hard parse, but not trivial.

The above ignores the Implicit Statement Cache. The Implicit Statement Cache stores the structures created by executing Prepared and Callable statements. If the app executes a SQL, closes the PreparedStatement, then creates a new PreparedStatement with the same SQL the structure created by the first execution is reused. The cost of the first execution of a PreparedStatement that was in the cache is for all practical purposes the same as re-executing the same PreparedStatement.

Upvotes: 2

Related Questions