Reputation: 253
#Issue# We are trying to optimize our dataserver application. It stores stocks and quotes in a MySQL database, and we are not satisfied with the fetch performance.
#Context#
- database
- table stock: around 500 rows
- table quote: 3 000 000 to 10 000 000 rows
- one-to-many association: one stock owns n quotes
- fetching around 1000 quotes per request
- there is an index on (stockId, date) in the quote table
- no cache, because in production, queries are always different
- Hibernate 3
- MySQL 5.5
- Java 6
- JDBC MySQL Connector 5.1.13
- c3p0 pooling
#Tests and results# ##Protocol##
##Case 1 : Hibernate with association## This fills our Stock object with 857 Quote objects (everything is correctly mapped in hibernate.xml).
session.enableFilter("after").setParameter("after", 1322910573000L);
Stock stock = (Stock) session.createCriteria(Stock.class).
add(Restrictions.eq("stockId", stockId)).
setFetchMode("quotes", FetchMode.JOIN).uniqueResult();
SQL generated :
SELECT this_.stockId AS stockId1_1_,
this_.symbol AS symbol1_1_,
this_.name AS name1_1_,
quotes2_.stockId AS stockId1_3_,
quotes2_.quoteId AS quoteId3_,
quotes2_.quoteId AS quoteId0_0_,
quotes2_.value AS value0_0_,
quotes2_.stockId AS stockId0_0_,
quotes2_.volume AS volume0_0_,
quotes2_.quality AS quality0_0_,
quotes2_.date AS date0_0_,
quotes2_.createdDate AS createdD7_0_0_,
quotes2_.fetcher AS fetcher0_0_
FROM stock this_
LEFT OUTER JOIN quote quotes2_ ON this_.stockId=quotes2_.stockId
AND quotes2_.date > 1322910573000
WHERE this_.stockId='AAPL'
ORDER BY quotes2_.date ASC
Results :
##Case 2 : Hibernate without association without HQL## Hoping to improve performance, we used this code, which fetches only the Quote objects; we then add them to a Stock manually (so we don't fetch repeated stock information for every row). We used createSQLQuery to minimize the effects of aliases and HQL mess.
String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
stock.addQuotes((ArrayList<Quote>) session.createSQLQuery("select * from quote q where stockId='" + stockId + "' " + filter).addEntity(Quote.class).list());
SQL generated :
SELECT *
FROM quote q
WHERE stockId='AAPL'
AND q.date>1322910573000
ORDER BY q.date ASC
Results :
##Case 3 : JDBC without Hibernate##
String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
Connection conn = SimpleJDBC.getConnection();
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from quote q where stockId='" + stockId + "' " + filter);
while(rs.next())
{
stock.addQuote(new Quote(rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"), rs.getByte("fetcher")));
}
stmt.close();
conn.close();
Results :
#Our understandings#
#Our questions#
Your help is very welcome.
Upvotes: 21
Views: 5723
Reputation: 340903
Can you do a smoke test with the simplest query possible, like:
SELECT current_timestamp()
or
SELECT 1 + 1
This will tell you what the actual JDBC driver overhead is. Also, it is not clear whether both tests were run from the same machine.
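A minimal sketch of such a smoke test, assuming you already have an open `java.sql.Connection` (for example the one returned by `SimpleJDBC.getConnection()`). The names `JdbcSmokeTest` and `timeTrivialQuery` are made up for illustration, and try-with-resources needs Java 7 or later:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper, not part of the original benchmark. */
final class JdbcSmokeTest {

    // The trivial query suggested above; it touches no table.
    static final String TRIVIAL_QUERY = "SELECT 1 + 1";

    /**
     * Executes the trivial query once on the given connection and
     * returns the elapsed wall-clock time in nanoseconds.
     */
    static long timeTrivialQuery(Connection conn) throws SQLException {
        long start = System.nanoTime();
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(TRIVIAL_QUERY)) {
            while (rs.next()) {
                rs.getInt(1); // read the value so the row is really fetched
            }
        }
        return System.nanoTime() - start;
    }
}
```

Comparing this number with the time of the real quote query should give a rough idea of how much is fixed round-trip cost and how much is spent transferring and converting the roughly 1000 rows.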
Is there a way to optimize the performance of JDBC driver ?
Run the same query several thousand times in Java. The JVM needs some time to warm up (class loading, JIT). Also, I assume SimpleJDBC.getConnection() uses C3P0 connection pooling; the cost of establishing a connection is pretty high, so the first few executions could be slow.
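A warm-up-aware measurement could look roughly like the sketch below. `WarmupBenchmark` and `averageNanos` are hypothetical names, the workload in `main` is only a stand-in for the real quote query, and the lambda syntax assumes Java 8 or later (on Java 6 an anonymous `Callable` does the same job):

```java
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of the original benchmark. */
final class WarmupBenchmark {

    /**
     * Runs the task warmupRuns times without timing it, then measuredRuns
     * times with timing, and returns the average duration of one measured
     * run in nanoseconds.
     */
    static long averageNanos(Callable<?> task, int warmupRuns, int measuredRuns) throws Exception {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        // Untimed runs: class loading, JIT compilation and pool start-up happen here.
        for (int i = 0; i < warmupRuns; i++) {
            task.call();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.call();
        }
        return (System.nanoTime() - start) / measuredRuns;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload; in the real test this would execute the quote query.
        Callable<Integer> task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++) {
                sb.append(i);
            }
            return sb.length();
        };
        long avg = averageNanos(task, 2000, 2000);
        System.out.println("average per run: " + avg + " ns");
    }
}
```

Only the average over the measured runs is reported, so a slow first execution no longer dominates the result.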
Also, prefer named queries to ad-hoc or criteria queries.
And will Hibernate benefit this optimization ?
Hibernate is a very complex framework. As you can see, it consumes 75% of the overall execution time compared to raw JDBC. If you need raw ORM (no lazy loading, dirty checking, or advanced caching), consider MyBatis, or maybe even JdbcTemplate with the RowMapper abstraction.
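To illustrate what this kind of "raw" mapping looks like, here is a minimal sketch in plain JDBC. `SimpleRowMapper` is a hand-rolled interface with the same single-method shape as Spring's `RowMapper`, not the Spring class itself; `QuoteRow`, `QuoteQueries` and the SQL constant are likewise assumptions, with column names taken from the question. It needs Java 8 or later for the lambda:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hand-rolled stand-in shaped like Spring's RowMapper (hypothetical). */
interface SimpleRowMapper<T> {
    T mapRow(ResultSet rs, int rowNum) throws SQLException;
}

/** Minimal value object for this sketch; the real Quote class is not shown in the question. */
final class QuoteRow {
    final int volume;
    final long date;
    final float value;

    QuoteRow(int volume, long date, float value) {
        this.volume = volume;
        this.date = date;
        this.value = value;
    }
}

final class QuoteQueries {

    // Parameterized version of the query from the question (assumed column list).
    static final String QUOTES_AFTER =
        "SELECT volume, date, value FROM quote WHERE stockId = ? AND date > ? ORDER BY date ASC";

    // One row in, one object out: no proxies, no dirty checking, no session cache.
    static final SimpleRowMapper<QuoteRow> QUOTE_MAPPER =
        (rs, rowNum) -> new QuoteRow(rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"));

    /** Binds the parameters, runs the query and maps every row with the given mapper. */
    static <T> List<T> query(Connection conn, String sql, SimpleRowMapper<T> mapper, Object... params)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 0; i < params.length; i++) {
                ps.setObject(i + 1, params[i]); // JDBC parameter indexes start at 1
            }
            try (ResultSet rs = ps.executeQuery()) {
                List<T> result = new ArrayList<>();
                int rowNum = 0;
                while (rs.next()) {
                    result.add(mapper.mapRow(rs, rowNum++));
                }
                return result;
            }
        }
    }
}
```

A call would then look like `QuoteQueries.query(conn, QuoteQueries.QUOTES_AFTER, QuoteQueries.QUOTE_MAPPER, "AAPL", 1322910573000L)`. Spring's JdbcTemplate offers the same idea ready-made, plus resource handling and exception translation.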
Is there a way to optimize Hibernate performance when converting result sets ?
Not really. Check out Chapter 19, "Improving performance", in the Hibernate documentation. There is a lot of reflection going on, plus class generation. Once again, Hibernate might not be the best solution when you want to squeeze every millisecond out of your database.
However, it is a good choice when you want to improve the overall user experience, thanks to its extensive caching support. Check out the performance doc again: it mostly talks about caching. There is a first-level cache, a second-level cache, a query cache... This is where Hibernate might actually outperform simple JDBC, because it can cache a lot in ways you could not even imagine. On the other hand, a poor cache configuration would make the setup even slower.
Check out: Caching with Hibernate + Spring - some Questions!
Are we facing something not tunable because of Java fundamental object and memory management ?
The JVM (especially in server configuration) is quite fast. Object creation on the heap is about as cheap as stack allocation in, e.g., C, and garbage collection has been greatly optimized. I don't think the Java version running plain JDBC would be much slower than a more native connection. That's why I suggested a few improvements to your benchmark.
Are we missing a point, are we stupid and all of this is vain ?
I believe that JDBC is a good choice if performance is your biggest issue. Java has been used successfully in a lot of database-heavy applications.
Upvotes: 7