Reputation: 20310
Based on my reading, the way to stream a ResultSet
in MySQL with the MySQL JDBC driver is to use these two statements:
stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);
My question: could an expert clarify whether streaming the ResultSet with the code above returns one row to the client, then goes back to the server to fetch the next row, and so on (terribly inefficient), or whether it is smart enough to do buffered streaming, like a BufferedStreamReader? If it does buffered streaming, how do I set the buffer size?
EDIT: From the doc:
The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to stream result sets row-by-row. After this, any result sets created with the statement will be retrieved row-by-row.
Does this mean that if I have 10M rows, there are 10M round trips to the server to get them? That would be terribly inefficient. How can I stream the ResultSet but have it buffered so that I don't have to make so many round trips?
EDIT2: It seems MySQL does some buffering automatically when fetchSize is set to Integer.MIN_VALUE. In my test I was able to read more than 40M rows in less than 20 minutes using setFetchSize(Integer.MIN_VALUE). This translates to more than 30,000 rows per second (40,000,000 rows / 1,200 s ≈ 33,000). I don't know how big the average row was, but it's hard to imagine 30,000 round trips per second.
Also a separate question: what does MySQL do if the result set has more elements than the fetchSize? e.g., result set has 10M rows and fetchSize is set to 1000. What happens then?
Upvotes: 3
Views: 2464
Reputation: 123399
It seems MySQL does some buffering automatically when fetchSize is set to Integer.MIN_VALUE.
It does, at least sometimes. I tested the behaviour of MySQL Connector/J version 5.1.37 using Wireshark. For the table ...
CREATE TABLE lorem (
id INT AUTO_INCREMENT PRIMARY KEY,
tag VARCHAR(7),
text1 VARCHAR(255),
text2 VARCHAR(255)
)
... with test data ...
id tag text1 text2
--- ------- --------------- ---------------
0 row_000 Lorem ipsum ... Lorem ipsum ...
1 row_001 Lorem ipsum ... Lorem ipsum ...
2 row_002 Lorem ipsum ... Lorem ipsum ...
...
999 row_999 Lorem ipsum ... Lorem ipsum ...
(where both `text1` and `text2` actually contain 255 characters in each row)
... and the code ...
try (Statement s = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY)) {
    s.setFetchSize(Integer.MIN_VALUE);  // ask the driver to stream the result set
    String sql = "SELECT * FROM lorem ORDER BY id";
    try (ResultSet rs = s.executeQuery(sql)) {
        // network capture inspected here, before any call to rs.next()
    }
}
... immediately after the s.executeQuery(sql) call, i.e., before rs.next() was even called, MySQL Connector/J had retrieved the first ~140 rows from the table.
In fact, when querying just the tag column
String sql = "SELECT tag FROM lorem ORDER BY id";
MySQL Connector/J immediately retrieved all 1000 rows, as shown by the Wireshark capture: frame 19 sent the query to the server, the server responded with frame 20, which was immediately followed by frame 21, and so on until the server had sent frame 32.
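Since the only difference between the two tests was the amount of data returned for each row, the amount read ahead appears to depend on the size of the rows rather than on a fixed row count. The test does not show what sets that limit: it is consistent with MySQL Connector/J choosing a buffer size from the maximum row length and the available memory, but also with the server simply streaming rows until the network buffers fill.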
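To repeat this kind of measurement on your own data, something like the following sketch might help. It streams a query with the same forward-only, read-only, Integer.MIN_VALUE settings and reports rows per second. The class name StreamingReadSketch and the helpers rowsPerSecond and measureStreaming are made up for illustration, and the Connection is assumed to come from your own setup.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class, not part of Connector/J.
class StreamingReadSketch {

    // Converts a row count and an elapsed time in nanoseconds to rows per second.
    static double rowsPerSecond(long rows, long elapsedNanos) {
        return rows / (elapsedNanos / 1_000_000_000.0);
    }

    // Streams the whole result of sql and returns the measured rows per second.
    static double measureStreaming(Connection conn, String sql) throws SQLException {
        try (Statement s = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                                ResultSet.CONCUR_READ_ONLY)) {
            s.setFetchSize(Integer.MIN_VALUE); // signal the driver to stream
            long start = System.nanoTime();
            long rows = 0;
            try (ResultSet rs = s.executeQuery(sql)) {
                while (rs.next()) {
                    rows++; // only count rows; real code would read columns here
                }
            }
            return rowsPerSecond(rows, System.nanoTime() - start);
        }
    }
}
```

Calling StreamingReadSketch.measureStreaming(conn, "SELECT * FROM lorem ORDER BY id") would give a rough throughput figure. It says nothing about the number of network frames, which still needs a capture tool such as Wireshark.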
what does MySQL do if the result set has more elements than the fetchSize? e.g., result set has 10M rows and fetchSize is set to 1000. What happens then?
MySQL Connector/J initially retrieves the first fetchSize rows; then, as rs.next() moves through them, it eventually retrieves the next group of rows. That is true even for setFetchSize(1) which, incidentally, is the way to really get only one row at a time.
(Note that setFetchSize(n) for n>0 requires useCursorFetch=true in the connection URL. That is apparently not required for setFetchSize(Integer.MIN_VALUE).)
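As a rough sketch of the batched variant described above: the connection URL carries useCursorFetch=true and the statement uses a positive fetch size. The class CursorFetchSketch and its helpers cursorFetchUrl and countRows are hypothetical names for illustration, and the host, port and database are placeholders you would replace with your own.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class, not part of Connector/J.
class CursorFetchSketch {

    // Builds a Connector/J URL with cursor-based fetching enabled.
    static String cursorFetchUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useCursorFetch=true";
    }

    // Reads every row of sql with the given positive fetch size and returns the row count.
    static long countRows(Connection conn, String sql, int fetchSize) throws SQLException {
        try (Statement s = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                                ResultSet.CONCUR_READ_ONLY)) {
            s.setFetchSize(fetchSize); // n > 0: relies on useCursorFetch=true in the URL
            long rows = 0;
            try (ResultSet rs = s.executeQuery(sql)) {
                while (rs.next()) {
                    rows++;
                }
            }
            return rows;
        }
    }
}
```

With a connection opened via DriverManager.getConnection(CursorFetchSketch.cursorFetchUrl("localhost", 3306, "test"), user, password), calling countRows(conn, "SELECT * FROM lorem ORDER BY id", 1000) should walk the table in groups of 1000 rows, assuming the behaviour described above holds for your driver version.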
Upvotes: 4