DaGoodBoy
DaGoodBoy

Reputation: 63

Using pymongo tailable cursors dies on empty collections

Hoping someone can help me understand if I'm seeing an issue or if I just don't understand mongodb tailable cursor behavior. I'm running mongodb 2.0.4 and pymongo 2.1.1.

Here is an script that demonstrates the problem.

#!/usr/bin/python

import sys
import time
import pymongo

MONGO_SERVER = "127.0.0.1"
MONGO_DATABASE = "mdatabase"
MONGO_COLLECTION = "mcollection"

mongodb    = pymongo.Connection(MONGO_SERVER, 27017)
database   = mongodb[MONGO_DATABASE]

if MONGO_COLLECTION in database.collection_names():
  database[MONGO_COLLECTION].drop()

print "creating capped collection"
database.create_collection(
  MONGO_COLLECTION,
  size=100000,
  max=100,
  capped=True
)
collection = database[MONGO_COLLECTION]

# Run this script with any parameter to add one record
# to the empty collection and see the code below
# loop correctly
#
if len(sys.argv[1:]):
  collection.insert(
    {
      "key" : "value",
    }
  )

# Get a tailable cursor for our looping fun
cursor = collection.find( {},
                          await_data=True,
                          tailable=True )

# This will catch ctrl-c and the error thrown if
# the collection is deleted while this script is
# running.
try:

  # The cursor should remain alive, but if there
  # is nothing in the collection, it dies after the
  # first loop. Adding a single record will
  # keep the cursor alive forever as I expected.
  while cursor.alive:
    print "Top of the loop"
    try:
      message = cursor.next()
      print message
    except StopIteration:
      print "MongoDB, why you no block on read?!"
      time.sleep(1)

except pymongo.errors.OperationFailure:
  print "Delete the collection while running to see this."

except KeyboardInterrupt:
  print "trl-C Ya!"
  sys.exit(0)

print "and we're out"

# End

So if you look at the code, it is pretty simple to demonstrate the issue I'm having. When I run the code against an empty collection (properly capped and ready for tailing), the cursor dies and my code exits after one loop. Adding a first record in the collection makes it behave the way I'd expect a tailing cursor to behave.

Also, what is the deal with the StopIteration exception killing the cursor.next() waiting on data? Why can't the backend just block until data becomes available? I assumed the await_data would actually do something, but it only seems to keep the connection waiting a second or two longer than without it.

Most of the examples on the net show putting a second While True loop around the cursor.alive loop, but then when the script tails an empty collection, the loop just spins and spins wasting CPU time for nothing. I really don't want to put in a single fake record just to avoid this issue on application startup.

Upvotes: 4

Views: 3433

Answers (1)

dcrosta
dcrosta

Reputation: 26258

This is known behavior, and the 2 loops "solution" is the accepted practice to work around this case. In the case that the collection is empty, rather than immediately retrying and entering a tight loop as you suggest, you can sleep for a short time (especially if you expect that there will soon be data to tail).

Upvotes: 1

Related Questions