Alex Gordon

Reputation: 60691

Inserting data into MongoDB using Python

Should I be re-initializing the connection on every insert?

from pymongo import MongoClient

class TwitterStream:
    def __init__(self, timeout=False):
        while True:
            self.dump_data()

    def dump_data(self):
        # dump my data into mongodb
        # should I be doing this every time??
        client = MongoClient()
        mongo = MongoClient('localhost', 27017)
        db = mongo.test
        # "tweets" is a placeholder collection name
        db.tweets.insert_one({'some stuff': 'other stuff'})
        # dump data and close connection

Do I need to open the connection every time I write a record? Or can I leave a connection open, assuming I'll be writing to the database 5 times per second with about 10 KB each time?

If just one connection is enough, where should I define the variables which hold the connection (client, mongo, db)?

Upvotes: 1

Views: 5566

Answers (2)

Ramiro Berrelleza

Reputation: 2405

Opening connections is generally an expensive operation, so I recommend reusing them as much as possible.

In the case of MongoClient, you should be able to leave the connection open and keep reusing it. However, the longer a connection lives, the more likely it is to hit connectivity issues at some point. The recommended solution is to rely on MongoClient's automatic reconnection and to catch the AutoReconnect exception as part of your retry logic.

Here's an example of that approach, adapted to Python 3 and insert_one from http://python.dzone.com/articles/save-monkey-reliably-writing:

import datetime
import random
import time

import pymongo

# mabel_db is assumed to be a pymongo Database object created earlier.
while True:
    time.sleep(1)
    data = {
        'time': datetime.datetime.utcnow(),
        'oxygen': random.random()
    }

    # Try for five minutes to recover from a failed primary
    for i in range(60):
        try:
            # insert_one waits for server acknowledgement by default
            mabel_db.breaths.insert_one(data)
            print('wrote')
            break  # Exit the retry loop
        except pymongo.errors.AutoReconnect as e:
            print('Warning', e)
            time.sleep(5)

Upvotes: 0

A. Jesse Jiryu Davis

Reputation: 24009

Open one MongoClient that lives for the duration of your program:

from pymongo import MongoClient

client = MongoClient()

class TwitterStream:
    def dump_data(self):
        while True:
            db = client.test
            # "tweets" is a placeholder collection name
            db.tweets.insert_one({'some stuff': 'other stuff'})

Opening a single MongoClient means you only pay its startup cost once, and its connection-pooling will minimize the cost of opening new connections.
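
As for where the connection variables should live, one option is to create the client once at startup and hand a collection to the class. The following is only a sketch under some assumptions: the collection name `tweets` and the `main` function are placeholders that do not come from the question, and `insert_one` requires PyMongo 3.0 or later.

```python
class TwitterStream:
    """Writes records through one collection handle kept for the object's lifetime."""

    def __init__(self, collection):
        # The caller owns the MongoClient; this class only borrows a collection,
        # so no new client is created per write.
        self.collection = collection

    def dump_data(self, record):
        # insert_one returns an InsertOneResult; inserted_id is the new document's _id.
        return self.collection.insert_one(record).inserted_id


def main():
    # Imported here so the class above can be exercised without PyMongo installed.
    from pymongo import MongoClient

    client = MongoClient('localhost', 27017)  # created once for the whole program
    try:
        stream = TwitterStream(client.test.tweets)  # "tweets" is a placeholder name
        stream.dump_data({'some stuff': 'other stuff'})
    finally:
        client.close()  # release pooled connections on shutdown
```

Passing the collection in, rather than reaching for a module-level global, also makes it easy to substitute a stand-in object when testing the class without a running server.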

If you're concerned about surviving occasional network issues, wrap your operations in a try/except block:

try:
    db.tweets.insert_one(...)  # "tweets" is a placeholder collection name
except pymongo.errors.ConnectionFailure:
    # Handle error.
    ...
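
What "handle error" means depends on the application. One possibility, shown here only as a sketch, is a small helper that retries a bounded number of times and then re-raises. The name `write_with_retry` and its parameters are hypothetical, not part of PyMongo; the exception class to retry on is passed in by the caller.

```python
import time


def write_with_retry(operation, retry_on, attempts=3, delay=1.0):
    """Call operation() until it succeeds or the attempts run out.

    retry_on is the exception class (or tuple of classes) treated as transient.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: let the caller see the failure
            time.sleep(delay)  # give the server or network time to recover
```

With PyMongo this might be called as `write_with_retry(lambda: db.tweets.insert_one(doc), pymongo.errors.ConnectionFailure)`, where `db.tweets` and `doc` stand in for your own collection and document.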

Upvotes: 1
