Reputation: 2446
I have a collection with 500K+ documents stored on a single-node MongoDB instance. Every now and then my pymongo cursor.find() fails because it times out.
While I could set the find to ignore the timeout, I do not like that approach. Instead, I tried a generator (adapted from this answer and this link):
def mongo_iterator(self, cursor, limit=1000):
    skip = 0
    while True:
        results = cursor.find({}).sort("signature", 1).skip(skip).limit(limit)
        try:
            results.next()
        except StopIteration:
            break
        for result in results:
            yield result
        skip += limit
I then call this method using:
ref_results_iter = self.mongo_iterator(cursor=latest_rents_refs, limit=50000)
for ref in ref_results_iter:
    results_latest1.append(ref)
The problem: My iterator does not return the same number of results. The issue is that next() advances the cursor, so every page loses its first element.
The question: Is there a way to adapt this code so that I can check whether a next element exists? PyMongo 3.x does not provide hasNext(), and the 'alive' check is not guaranteed to return False.
Upvotes: 5
Views: 5409
Reputation: 61225
The .find() method takes additional keyword arguments. One of them is no_cursor_timeout, which you need to set to True:
cursor = collection.find({}, no_cursor_timeout=True)
You don't need to write your own generator function. The find() method returns a generator-like object that you can iterate over directly.
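One caveat worth adding: a cursor opened with no_cursor_timeout=True is not cleaned up by the server when it sits idle, so it should be closed explicitly once you are done with it. A minimal sketch, assuming collection is a pymongo Collection; the helper name iter_all is hypothetical and not part of pymongo:

```python
def iter_all(collection):
    """Yield every document, closing the no-timeout cursor afterwards.

    `collection` is assumed to be a pymongo Collection (or anything with
    a compatible find() method). `iter_all` is a hypothetical helper name.
    """
    cursor = collection.find({}, no_cursor_timeout=True)
    try:
        for doc in cursor:
            yield doc
    finally:
        # Runs when iteration finishes, fails, or the generator is closed,
        # so the server-side cursor is not left open.
        cursor.close()
```

The results can then be collected with something like results_latest1 = list(iter_all(latest_rents_refs)).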
Upvotes: 3
Reputation: 60974
Why not use
for result in results:
    yield result
The for loop should handle StopIteration for you.
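Dropping the next() probe still leaves the question of when to stop the outer while loop. One possible sketch, not necessarily the best approach: count the documents each page yields and stop once a page comes back short. The signature here (a collection argument and a sort_key parameter) is my own assumption, adapted from the code in the question.

```python
def mongo_iterator(collection, limit=1000, sort_key="signature"):
    """Yield every document in `collection`, one skip/limit page per query.

    Adapted from the question's generator; the parameter names are
    illustrative assumptions, not an existing pymongo API.
    """
    if limit <= 0:
        # limit(0) means "no limit" in MongoDB, which would break paging.
        raise ValueError("limit must be a positive integer")
    skip = 0
    while True:
        page = collection.find({}).sort(sort_key, 1).skip(skip).limit(limit)
        count = 0
        for doc in page:  # the for loop consumes StopIteration itself
            count += 1
            yield doc
        if count < limit:
            # A short (or empty) page means there is nothing left to fetch.
            break
        skip += limit
```

No element is lost because nothing is consumed outside the for loop. Keep in mind that skip() tends to get slower as the offset grows, since the server still has to walk past the skipped documents.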
Upvotes: 1