Reputation: 218
We are trying to build an application that returns paginated results from a Cassandra database for a UI. The UI would pass fetchSize and pagingState to our API, and based on that we would return a List<MyObject> of size fetchSize. If pagingState is passed, we would resume the query from the last page (as described in the Cassandra docs: https://docs.datastax.com/en/developer/java-driver/3.6/manual/paging/).
Please note that I'm using Cassandra driver version 3.6.
But when we implemented this, Cassandra always returned all entries in the database, ignoring the fetch size, which in turn resulted in a null value for ResultSet.getExecutionInfo().getPagingState(). How do I solve this?
I created 16 records in my database for MyObject and tried passing a fetch size of 5 to get them. All 16 records have the same partition key, ID-1.
// Util method to invoke a Statement. "session" is the Cassandra session
public static ResultSet execute(int pageSize, Statement statement, String pageState) {
    if (isVoid(pageSize)) {
        pageSize = -1;
    }
    statement.setFetchSize(pageSize);
    if (!isVoid(pageState)) {
        statement.setPagingState(PagingState.fromString(pageState));
    }
    return session.execute(statement);
}
// Accessor interface method for my query that returns a Statement object
@Query("SELECT * FROM " + MY_TABLE + " WHERE id=:id")
Statement getAll(@Param("id") String id);
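// Main code returning a list of MyObject; "mapper" is the object Mapper
Statement statement = accessor.getAll("ID1");
ResultSet rs = execute(5, statement, null);
List<MyObject> list = mapper.map(rs).all();
// getPagingState() returns a PagingState (or null), so convert it explicitly
PagingState pagingState = rs.getExecutionInfo().getPagingState();
String pageState = pagingState == null ? null : pagingState.toString();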
In the above code, I expected Cassandra to return a list of 5 MyObject objects and a string value for my pageState variable.
Neither worked as expected. The list had a size of 16 (it fetched all records), and because of that, pageState was null, as all records had already been fetched.
What am I missing here?
EDIT:
From observation, ResultSet will honour the fetchSize passed in the statement, but when we map it to List<MyObject> using the all() method, it fetches all the results in the database (of size = cluster-wide fetchSize).
So when I invoked the Result#one method 5 (= pageSize) times and pushed the results into a List, I got the paging state as well as a result list of size pageSize.
Sample util method for the above:
public static <T> List<T> getPaginatedList(ResultSet resultSet, Mapper<T> mapper, int pageSize) {
    List<T> entities = new ArrayList<>();
    Result<T> result = mapper.map(resultSet);
    IntStream.range(1, pageSize).forEach(i -> {
        entities.add(result.one());
    });
    return entities;
}
What is the performance impact of this?
Upvotes: 2
Views: 1425
Reputation: 11638
As you were able to discern, the reason you are getting all results back even though you call setFetchSize is that the fetch size only sets the size of each requested page. When you invoke all(), the driver transparently pages through all results.
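To make that distinction concrete, here is a toy model in plain Java. It does not use the DataStax driver: PagedSource, fetchPage() and pagesFetched are hypothetical names invented for this sketch. The class serves rows in pages of fetchSize, and its all() method keeps requesting pages until the data runs out, which is roughly the behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: PagedSource is a hypothetical stand-in, not a driver class.
final class PagedSource {
    private final List<Integer> rows;   // pretend table contents
    private final int fetchSize;        // maximum rows per page
    private int next = 0;               // index of the next unread row
    int pagesFetched = 0;               // number of non-empty pages requested

    PagedSource(List<Integer> rows, int fetchSize) {
        this.rows = rows;
        this.fetchSize = fetchSize;
    }

    // Returns the next page (at most fetchSize rows); empty once exhausted.
    List<Integer> fetchPage() {
        int end = Math.min(next + fetchSize, rows.size());
        List<Integer> page = new ArrayList<>(rows.subList(next, end));
        if (!page.isEmpty()) {
            pagesFetched++;
        }
        next = end;
        return page;
    }

    // Mimics transparent paging: keeps fetching pages until nothing is left.
    List<Integer> all() {
        List<Integer> out = new ArrayList<>();
        for (List<Integer> page = fetchPage(); !page.isEmpty(); page = fetchPage()) {
            out.addAll(page);
        }
        return out;
    }
}
```

In this model, 16 rows with a fetch size of 5 come back from all() as one list of 16 after four page requests, whereas a single fetchPage() call returns only 5 rows.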
Calling one() individually will not have a performance impact compared to all(). However, I would recommend changing your logic for consuming the page, as I would expect IntStream.range(1, pageSize) to fail if you've exhausted your result set (e.g. you set the fetch size to 500, but there are only 495 rows). Instead you could use IntStream.range(0, resultSet.getAvailableWithoutFetching()). Note that the range should start at 0, since IntStream.range(1, n) yields only n - 1 values.
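The idea of bounding the loop by what is already buffered can be sketched without a cluster. In the snippet below, Rows<T> is a hypothetical stand-in interface that copies only the two method names discussed here (getAvailableWithoutFetching() and one()); currentPage and buffered are illustrative helpers, not driver API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class CurrentPageSketch {

    // Hypothetical stand-in for the two driver methods discussed above.
    interface Rows<T> {
        int getAvailableWithoutFetching(); // rows already held in memory
        T one();                           // next row, or null when none are left
    }

    // Collects at most pageSize rows, and never more than are already buffered.
    static <T> List<T> currentPage(Rows<T> rows, int pageSize) {
        int n = Math.min(pageSize, rows.getAvailableWithoutFetching());
        List<T> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {      // starts at 0, so exactly n iterations
            out.add(rows.one());
        }
        return out;
    }

    // In-memory Rows implementation, used only to exercise the helper.
    static <T> Rows<T> buffered(List<T> items) {
        Deque<T> queue = new ArrayDeque<>(items);
        return new Rows<T>() {
            public int getAvailableWithoutFetching() { return queue.size(); }
            public T one() { return queue.poll(); }
        };
    }
}
```

Taking the minimum of pageSize and the buffered count means a short final page (say 3 rows when 5 were requested) produces a list of 3 rather than a list padded with nulls.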
You could also choose to iterate over the result set until ResultSet.isExhausted() returns true to prevent fetching the next page.
Upvotes: 2