Reputation: 231
I am executing a find query in MongoDB using Java on a collection, with batchSize set to 500. My collection has 10,000 records, but with batchSize set I only get records 1-500. How do I get the next set of records?
DBCursor cursor = collection.find(query).batchSize(batchSize);
while (cursor.hasNext()) {
    // write to file.
    DBObject obj = cursor.next();
    objectIdList.add(obj.get("_id"));
}
Upvotes: 6
Views: 12293
Reputation: 47865
The DBCursor allows you to iterate over the set of documents which are deemed relevant to the query you passed into the find() method. It lazily fetches these documents from the underlying database in chunks of batchSize. So, with the default batch size (101, IIRC) it will return the first 101 documents to your client, and then as your client code iterates beyond the 101st document it will (behind the scenes) grab the next 101 documents, and so on until the matching documents are exhausted or the cursor is closed.
The same applies when you set an explicit batchSize, so in your case, with batchSize=500, the find() call returns a DBCursor which contains (at most) the first 500 documents, and if more than 500 documents match your query then, as you iterate beyond the 500th document, the MongoDB Java driver will (behind the scenes) fetch the next batch.
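To make the behind-the-scenes behaviour concrete, here is a purely illustrative, driver-free simulation of lazy batch fetching (BatchingCursor is a made-up class, not part of the MongoDB driver): the caller just keeps calling hasNext()/next(), and the "cursor" transparently pulls further chunks of batchSize from the "server".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the driver's cursor: it buffers one batch
// at a time and issues a new "fetch" only when the buffer runs dry.
class BatchingCursor implements Iterator<Integer> {
    private final List<Integer> server;            // pretend server-side result set
    private final int batchSize;
    private final ArrayDeque<Integer> buffer = new ArrayDeque<>();
    private int fetched = 0;                       // documents pulled so far
    private int fetchCount = 0;                    // simulated round trips

    BatchingCursor(List<Integer> server, int batchSize) {
        this.server = server;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        if (buffer.isEmpty() && fetched < server.size()) {
            int end = Math.min(fetched + batchSize, server.size());
            buffer.addAll(server.subList(fetched, end)); // one "getMore"
            fetched = end;
            fetchCount++;
        }
        return !buffer.isEmpty();
    }

    @Override
    public Integer next() {
        return buffer.poll();
    }

    int fetchCount() {
        return fetchCount;
    }
}

public class Main {
    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 10000; i++) docs.add(i);

        BatchingCursor cursor = new BatchingCursor(docs, 500);
        int seen = 0;
        while (cursor.hasNext()) {
            cursor.next();
            seen++;
        }
        // Plain iteration sees all 10,000 documents, in 20 batches of 500.
        System.out.println(seen + " batches=" + cursor.fetchCount());
    }
}
```

The point: nothing special is needed to "get the next set of records"; continuing to iterate past the 500th document is exactly what triggers the next fetch.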
You stated ...
My collection has 10,000 records but with batchsize set i get only 1-500 records
... if you only get 500 documents then either you stopped iterating after 500, or only 500 documents were deemed relevant to your query.
You can see how many documents are relevant to your query by using the count() method. For example:
int count = collection.find(query).count();
You can also grab all of the documents relevant to your query in one go, without using a DBCursor, like this ...
List<DBObject> obj = collection.find(query).toArray();
... though of course this might have implications for your application's heap, since it would result in every document which meets your criteria being stored on-heap in your client (rather than the more memory-friendly approach of reading them in batches via the DBCursor).
Upvotes: 8
Reputation: 8358
You can use the skip method to achieve this, for example:
collection.find(query).batchSize(batchSize).skip(500)
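If you do want explicit paging (rather than relying on the cursor to fetch further batches automatically as you iterate), you can combine skip with limit in a loop. This is only a sketch, reusing the collection, query, and objectIdList from the question; note that skip becomes increasingly expensive for large offsets, because the server still walks over all of the skipped documents.

```java
// Sketch: explicit paging with skip/limit (legacy driver API).
int pageSize = 500;
for (int page = 0; ; page++) {
    DBCursor cursor = collection.find(query)
                                .skip(page * pageSize)
                                .limit(pageSize);
    if (!cursor.hasNext()) {
        break; // no more matching documents
    }
    while (cursor.hasNext()) {
        objectIdList.add(cursor.next().get("_id"));
    }
}
```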
Upvotes: 0