Reputation: 154170
According to the JDBC specification, the Statement.setMaxRows(int maxRows) method is supposed to:
Sets the limit for the maximum number of rows that any ResultSet object generated by this Statement object can contain to the given number. If the limit is exceeded, the excess rows are silently dropped.
When testing it against limiting the result set at the SQL level (ROWNUM, TOP, and LIMIT), both the JDBC and the SQL constructs seem to perform equally well. Even when selecting millions of rows, setMaxRows doesn't seem to perform any worse.
Could it be that the database executor uses a cursor which only fetches records on demand, so that when the driver reaches the maxRows threshold, it can instruct the database to close the cursor? That way, the database doesn't have to materialize a huge result set and send it over the wire, only for it to be discarded on the client side.
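For reference, the two approaches being compared look like this (a minimal sketch; the table name, query, and connection are hypothetical, and the LIMIT syntax assumes PostgreSQL/MySQL):

```java
import java.sql.*;

public class MaxRowsSketch {
    // SQL-level limit: the bound is part of the query text, so the
    // optimizer sees it at plan time (PostgreSQL/MySQL syntax assumed).
    static String withLimit(String baseQuery, int maxRows) {
        return baseQuery + " LIMIT " + maxRows;
    }

    // Driver-level limit: the bound is enforced by the JDBC driver/protocol,
    // not visible in the SQL text.
    static void queryWithMaxRows(Connection conn, String baseQuery, int maxRows)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.setMaxRows(maxRows);                 // excess rows are silently dropped
            try (ResultSet rs = stmt.executeQuery(baseQuery)) {
                while (rs.next()) {
                    // process rows; at most maxRows will be returned
                }
            }
        }
    }
}
```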
Upvotes: 3
Views: 2453
Reputation: 248125
When you use setMaxRows(), the PostgreSQL server stops producing query results once the limit has been reached, so it does not send all rows to the client.
However, that row limit is only imposed at query execution time, so the PostgreSQL optimizer doesn't know about it and will produce an execution plan optimized for fetching the whole result set, rather than one that produces the first rows quickly. Moreover, using setMaxRows() prevents parallel query execution, even if the optimizer planned for it, which can also result in a suboptimal execution plan.
Prefer an explicit LIMIT clause over setMaxRows() whenever possible.
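A minimal sketch of this recommendation, assuming PostgreSQL syntax and a hypothetical big_table; parameterizing the limit keeps it visible to the planner while still letting the caller vary it:

```java
import java.sql.*;

public class ExplicitLimit {
    // Appends an explicit, parameterized LIMIT so the bound is part of the
    // statement itself rather than a driver-side cap (PostgreSQL syntax).
    static String limitedQuery(String baseQuery) {
        return baseQuery + " LIMIT ?";
    }

    static void fetchFirst(Connection conn, int n) throws SQLException {
        try (PreparedStatement ps =
                 conn.prepareStatement(limitedQuery("SELECT id FROM big_table"))) {
            ps.setInt(1, n);   // the limit is known to the server at plan time
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // process at most n rows
                }
            }
        }
    }
}
```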
Upvotes: 0
Reputation: 5298
Oracle uses a producer-consumer design pattern: rows are generated at the moment the client starts fetching from the cursor into the ResultSet. There are two optimizer goals: ALL_ROWS and FIRST_ROWS (resp. FIRST_ROWS(n)). With the FIRST_ROWS goal, Oracle tends to prefer nested loops over hash joins, and should therefore return the first batch of result data much faster. But I'm not sure whether using the setMaxRows method also changes the optimizer goal for the query.
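If the goal is a first-rows-optimized Oracle plan, the hint can be stated explicitly in the query text instead of relying on setMaxRows to influence the optimizer (a sketch; the table and column names are hypothetical):

```java
public class FirstRowsHint {
    // Prepends Oracle's FIRST_ROWS(n) optimizer hint to the SELECT body,
    // asking the optimizer to favour plans that return the first n rows quickly.
    static String firstRows(String selectBody, int n) {
        return "SELECT /*+ FIRST_ROWS(" + n + ") */ " + selectBody;
    }
}
```

Oracle 12c and later also support the declarative FETCH FIRST n ROWS ONLY clause, which conveys the same intent without a hint.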
Upvotes: 1
Reputation: 109242
Most JDBC drivers will fetch rows on demand (based on the fetch size), so maxRows will usually be quite efficient. They often even optimize by simply fetching no more than maxRows rows.
A ROWS or TOP clause might give the database server some extra hints for optimizing the query, so setting maxRows might not be as efficient as including the maximum in the query itself. The exact behaviour is driver- and database-dependent, so it is hard to generalize about behaviour and performance characteristics.
The notable exception is the MySQL driver (and probably MariaDB as well), which defaults to fetching all rows immediately on query execution (unless the fetch size is set to Integer.MIN_VALUE).
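Per the Connector/J documentation, streaming row-by-row in MySQL requires a forward-only, read-only statement with the Integer.MIN_VALUE sentinel as the fetch size (a sketch; the connection itself is assumed):

```java
import java.sql.*;

public class MySqlStreaming {
    // Enables row-by-row streaming in MySQL Connector/J: the statement must be
    // TYPE_FORWARD_ONLY and CONCUR_READ_ONLY, with fetch size Integer.MIN_VALUE.
    static Statement streamingStatement(Connection conn) throws SQLException {
        Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                              ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(Integer.MIN_VALUE);  // sentinel value, not a real row count
        return stmt;
    }
}
```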
As an example, Jaybird (the Firebird JDBC driver) does the following (for TYPE_FORWARD_ONLY result sets):
public void fetch() throws SQLException {
    synchronized (syncProvider.getSynchronizationObject()) {
        checkClosed();
        int maxRows = 0;
        if (this.maxRows != 0) maxRows = this.maxRows - rowNum;

        int fetchSize = this.fetchSize;
        if (fetchSize == 0) fetchSize = MAX_FETCH_ROWS;
        if (maxRows != 0 && fetchSize > maxRows) fetchSize = maxRows;

        if (!allRowsFetched && (rows.isEmpty() || rows.size() == rowPosition)) {
            rows.clear();
            stmt.fetchRows(fetchSize);
            rowPosition = 0;
        }

        if (rows.size() > rowPosition) {
            setNextRow(rows.get(rowPosition));
            // help the garbage collector
            rows.set(rowPosition, null);
            rowPosition++;
        } else {
            setNextRow(null);
        }
    }
}
As the server may decide to send more rows than requested, additional checks are made on next().
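The fetch-size arithmetic in the snippet above can be isolated as a pure function (a paraphrase for illustration, not Jaybird's actual API): the remaining row budget (maxRows minus rows already read) caps the fetch size, and a fetch size of 0 falls back to the driver's default.

```java
public class FetchSizeClamp {
    // Mirrors the Jaybird logic: remaining = maxRows - rowNum caps the fetch
    // size; fetchSize == 0 means "use the driver default" (MAX_FETCH_ROWS);
    // maxRows == 0 means "no limit".
    static int effectiveFetchSize(int maxRows, int rowNum, int fetchSize, int maxFetchRows) {
        int remaining = (maxRows != 0) ? maxRows - rowNum : 0;
        int size = (fetchSize != 0) ? fetchSize : maxFetchRows;
        if (remaining != 0 && size > remaining) size = remaining;
        return size;
    }
}
```

So with maxRows = 10 and 7 rows already read, only 3 more rows are requested from the server, regardless of the configured fetch size.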
Upvotes: 1
Reputation: 325131
In PostgreSQL, PgJDBC sends a request at the protocol level that is the equivalent of appending a LIMIT to the query, so the database server knows to minimise the amount of work it does where possible. It might choose a plan that would be more expensive for fetching all rows but that can start returning rows sooner, or one that avoids a big all-rows sort, for example.
I expect client drivers for other engines behave similarly: they set a limit behind the scenes, or use a cursor and read until they have enough results. Each DBMS and driver is different, though, so a single definitive answer may be hard to find.
Upvotes: 3