Lucas Mouilleron

Reputation: 253

Hibernate, JDBC and Java performance on medium and big result set

#Issue#

We are trying to optimize our data server application. It stores stocks and quotes in a MySQL database, and we are not satisfied with the fetch performance.

#Context#

- database
  - table stock: around 500 rows
  - table quote: 3,000,000 to 10,000,000 rows
  - one-to-many association: one stock owns n quotes
  - fetching around 1000 quotes per request
  - there is an index on (stockId, date) in the quote table
- no cache, because in production, queries are always different
- Hibernate 3
- MySQL 5.5
- Java 6
- JDBC MySQL Connector 5.1.13
- c3p0 pooling

#Tests and results#

##Protocol##

##Case 1 : Hibernate with association##

This fills our Stock object with 857 Quote objects (everything is correctly mapped in hibernate.xml).

session.enableFilter("after").setParameter("after", 1322910573000L);
Stock stock = (Stock) session.createCriteria(Stock.class)
    .add(Restrictions.eq("stockId", stockId))
    .setFetchMode("quotes", FetchMode.JOIN)
    .uniqueResult();

SQL generated :

SELECT this_.stockId AS stockId1_1_,
       this_.symbol AS symbol1_1_,
       this_.name AS name1_1_,
       quotes2_.stockId AS stockId1_3_,
       quotes2_.quoteId AS quoteId3_,
       quotes2_.quoteId AS quoteId0_0_,
       quotes2_.value AS value0_0_,
       quotes2_.stockId AS stockId0_0_,
       quotes2_.volume AS volume0_0_,
       quotes2_.quality AS quality0_0_,
       quotes2_.date AS date0_0_,
       quotes2_.createdDate AS createdD7_0_0_,
       quotes2_.fetcher AS fetcher0_0_
FROM stock this_
LEFT OUTER JOIN quote quotes2_ ON this_.stockId=quotes2_.stockId
AND quotes2_.date > 1322910573000
WHERE this_.stockId='AAPL'
ORDER BY quotes2_.date ASC

Results :

##Case 2 : Hibernate without association, without HQL##

Hoping to improve performance, we used code that fetches only the Quote objects and adds them to a Stock manually (so we don't fetch repeated stock information for every row). We used createSQLQuery to minimize the effects of aliases and HQL mess.

String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
stock.addQuotes((ArrayList<Quote>) session.createSQLQuery("select * from quote q where stockId='" + stockId + "' " + filter).addEntity(Quote.class).list());

SQL generated :

SELECT *
FROM quote q
WHERE stockId='AAPL'
  AND q.date>1322910573000
ORDER BY q.date ASC

Results :

##Case 3 : JDBC without Hibernate##

String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
Connection conn = SimpleJDBC.getConnection();
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from quote q where stockId='" + stockId + "' " + filter);
while(rs.next())
{
    stock.addQuote(new Quote(rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"), rs.getByte("fetcher")));
}
stmt.close();
conn.close();

Results :

#Our understandings#

#Our questions#

Your help is very welcome.

Upvotes: 21

Views: 5723

Answers (1)

Tomasz Nurkiewicz

Reputation: 340903

Can you do a smoke test with the simplest query possible, like:

SELECT current_timestamp()

or

SELECT 1 + 1

This will tell you the actual JDBC driver overhead. Also, it is not clear whether both tests are performed from the same machine.

Is there a way to optimize the performance of JDBC driver ?

Run the same query several thousand times in Java. The JVM needs some time to warm up (class loading, JIT). Also, I assume SimpleJDBC.getConnection() uses c3p0 connection pooling; the cost of establishing a connection is pretty high, so the first few executions could be slow.
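As a rough illustration of the warm-up advice, here is a minimal timing harness. It is only a sketch: the class name `WarmupBenchmark`, its methods, and the run counts are made up for this example, and the workload in `main` is a stand-in that you would replace with the actual JDBC or Hibernate fetch.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of any library: runs a task unmeasured first
// (class loading, JIT, pool start-up), then times the following runs.
public class WarmupBenchmark {

    /** Calls the task warmupRuns times without timing, then returns the duration in ns of each measured run. */
    public static long[] measure(Callable<?> task, int warmupRuns, int measuredRuns) throws Exception {
        for (int i = 0; i < warmupRuns; i++) {
            task.call(); // results and timings deliberately ignored
        }
        long[] durations = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.call();
            durations[i] = System.nanoTime() - start;
        }
        return durations;
    }

    /** Returns the middle element of the sorted durations (upper median for even lengths). */
    public static long median(long[] durations) {
        long[] sorted = durations.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload; replace the body with the query under test.
        Callable<Integer> task = new Callable<Integer>() {
            public Integer call() {
                int sum = 0;
                for (int i = 0; i < 1000; i++) {
                    sum += i;
                }
                return sum;
            }
        };
        long[] durations = measure(task, 2000, 1000); // run counts are arbitrary
        System.out.println("median ns per run: " + median(durations));
    }
}
```

Reporting the median rather than the mean keeps a few slow outliers (a GC pause, a fresh pooled connection) from dominating the figure.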

Also, prefer named queries to ad-hoc or criteria queries.

And will Hibernate benefit this optimization ?

Hibernate is a very complex framework. As you can see, it consumes 75% of the overall execution time compared to raw JDBC. If you need raw ORM (no lazy loading, dirty checking, or advanced caching), consider MyBatis. Or maybe even JdbcTemplate with the RowMapper abstraction.
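To give an idea of what the row-mapper style looks like, here is a hand-rolled sketch in plain JDBC. It does not use Spring: the `RowMapper` interface below only mirrors the shape of Spring's `mapRow(ResultSet, int)`, and `RowMapping`, `mapAll`, and the stand-in `Quote` class are assumptions made for this example (the column names are taken from the question's Case 3).

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class RowMapping {

    /** Hypothetical hand-rolled analogue of a row mapper: turns the current row into an object. */
    public interface RowMapper<T> {
        T mapRow(ResultSet rs, int rowNum) throws SQLException;
    }

    /** Stand-in for the question's Quote class; the real one may differ. */
    public static final class Quote {
        public final int volume;
        public final long date;
        public final float value;
        public final byte fetcher;

        public Quote(int volume, long date, float value, byte fetcher) {
            this.volume = volume;
            this.date = date;
            this.value = value;
            this.fetcher = fetcher;
        }
    }

    /** Reads the same four columns as the JDBC loop in Case 3. */
    public static final RowMapper<Quote> QUOTE_MAPPER = new RowMapper<Quote>() {
        public Quote mapRow(ResultSet rs, int rowNum) throws SQLException {
            return new Quote(rs.getInt("volume"), rs.getLong("date"),
                             rs.getFloat("value"), rs.getByte("fetcher"));
        }
    };

    /** Iterates the result set once, with no reflection, proxies, or session tracking. */
    public static <T> List<T> mapAll(ResultSet rs, RowMapper<T> mapper) throws SQLException {
        List<T> out = new ArrayList<T>();
        int rowNum = 0;
        while (rs.next()) {
            out.add(mapper.mapRow(rs, rowNum++));
        }
        return out;
    }
}
```

The mapping cost here is one constructor call per row, which is about what such thin abstractions add on top of raw JDBC.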

Is there a way to optimize Hibernate performance when converting result sets ?

Not really. Check out Chapter 19, "Improving performance", in the Hibernate documentation. There is a lot of reflection and class generation happening there. Once again, Hibernate might not be the best solution when you want to squeeze every millisecond out of your database.

However, it is a good choice when you want to improve the overall user experience, thanks to its extensive caching support. Check out the performance doc again: it mostly talks about caching. There is a first-level cache, a second-level cache, a query cache... This is where Hibernate might actually outperform simple JDBC: it can cache a lot, in ways you could not even imagine. On the other hand, a poor cache configuration would lead to an even slower setup.

Check out: Caching with Hibernate + Spring - some Questions!

Are we facing something not tunable because of Java fundamental object and memory management ?

The JVM (especially in server configuration) is quite fast. Object creation on the heap is about as fast as stack allocation in, e.g., C, and garbage collection has been greatly optimized. I don't think a Java version running plain JDBC would be much slower than a more native connection. That's why I suggested a few improvements to your benchmark.

Are we missing a point, are we stupid and all of this is vain ?

I believe that JDBC is a good choice if performance is your biggest issue. Java has been used successfully in a lot of database-heavy applications.

Upvotes: 7
