Reputation: 63
Hoping someone can help me understand whether I'm seeing an issue or whether I just don't understand MongoDB's tailable cursor behavior. I'm running MongoDB 2.0.4 and pymongo 2.1.1.
Here is a script that demonstrates the problem.
#!/usr/bin/python
import sys
import time
import pymongo

MONGO_SERVER = "127.0.0.1"
MONGO_DATABASE = "mdatabase"
MONGO_COLLECTION = "mcollection"

mongodb = pymongo.Connection(MONGO_SERVER, 27017)
database = mongodb[MONGO_DATABASE]

if MONGO_COLLECTION in database.collection_names():
    database[MONGO_COLLECTION].drop()

print "creating capped collection"
database.create_collection(
    MONGO_COLLECTION,
    size=100000,
    max=100,
    capped=True
)
collection = database[MONGO_COLLECTION]

# Run this script with any parameter to add one record
# to the empty collection and see the code below
# loop correctly
#
if len(sys.argv[1:]):
    collection.insert(
        {
            "key" : "value",
        }
    )

# Get a tailable cursor for our looping fun
cursor = collection.find( {},
                          await_data=True,
                          tailable=True )

# This will catch ctrl-c and the error thrown if
# the collection is deleted while this script is
# running.
try:
    # The cursor should remain alive, but if there
    # is nothing in the collection, it dies after the
    # first loop. Adding a single record will
    # keep the cursor alive forever as I expected.
    while cursor.alive:
        print "Top of the loop"
        try:
            message = cursor.next()
            print message
        except StopIteration:
            print "MongoDB, why you no block on read?!"
            time.sleep(1)
except pymongo.errors.OperationFailure:
    print "Delete the collection while running to see this."
except KeyboardInterrupt:
    print "trl-C Ya!"
    sys.exit(0)

print "and we're out"
# End
The code demonstrates the issue fairly simply. When I run it against an empty collection (properly capped and ready for tailing), the cursor dies and the script exits after one pass through the loop. Adding a first record to the collection makes it behave the way I'd expect a tailable cursor to behave.
Also, what is the deal with the StopIteration exception killing the cursor.next() waiting on data? Why can't the backend just block until data becomes available? I assumed the await_data would actually do something, but it only seems to keep the connection waiting a second or two longer than without it.
Most of the examples on the net wrap a second while True loop around the cursor.alive loop, but then when the script tails an empty collection, the outer loop just spins, wasting CPU time for nothing. I really don't want to insert a single fake record just to avoid this issue on application startup.
Upvotes: 4
Views: 3433
Reputation: 26258
This is known behavior, and the two-loop "solution" is the accepted way to work around it. When the collection is empty, rather than retrying immediately and entering a tight loop as you describe, you can sleep for a short time before retrying (especially if you expect that there will soon be data to tail).
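For what it's worth, here is a minimal sketch of the two-loop pattern with a sleep between retries. The function and parameter names (tail_with_retry, make_cursor, handle, retry_delay, should_stop, sleep) are made up for illustration and are not part of pymongo; the only cursor features it relies on are the alive attribute and next(), both of which the script in the question already uses. It assumes make_cursor wraps something like the question's collection.find({}, tailable=True, await_data=True).

```python
import time


def tail_with_retry(make_cursor, handle, retry_delay=1.0,
                    should_stop=lambda: False, sleep=time.sleep):
    """Two-loop tailing sketch (hypothetical helper, not a pymongo API).

    make_cursor -- callable returning a fresh tailable cursor
    handle      -- callable invoked with each document read
    retry_delay -- seconds to pause when there is nothing to read
    should_stop -- callable returning True when tailing should end
    sleep       -- injectable sleep function (defaults to time.sleep)
    """
    # Outer loop: open a new cursor whenever the previous one has died,
    # which is what happens when the capped collection is empty.
    while not should_stop():
        cursor = make_cursor()

        # Inner loop: read for as long as the cursor stays alive.
        while cursor.alive and not should_stop():
            try:
                handle(next(cursor))
            except StopIteration:
                # No document available right now; pause instead of
                # calling next() again immediately.
                sleep(retry_delay)

        # The cursor is dead. Pause before re-querying so an empty
        # collection does not turn the outer loop into a busy loop.
        if not should_stop():
            sleep(retry_delay)
```

One caveat with this sketch: every new cursor re-runs the same query, so if a cursor dies after it has already delivered documents, the next cursor will return them again from the start of the collection. For the empty-collection startup case in the question that is harmless, since the dead cursor never returned anything.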
Upvotes: 1