lovesh

Reputation: 5411

Connection reset by peer error in MongoDB on bulk insert

I am trying to insert 500 documents with a bulk insert in pymongo, and I get this error:

File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 306, in insert
    continue_on_error, self.__uuid_subtype), safe)
  File "/usr/lib64/python2.6/site-packages/pymongo/connection.py", line 748, in _send_message
    raise AutoReconnect(str(e))
pymongo.errors.AutoReconnect: [Errno 104] Connection reset by peer

I looked around and found here that this happens when the total size of the inserted documents exceeds 16 MB, which would mean my 500 documents add up to more than 16 MB. So I checked the size of the 500 documents (Python dictionaries) like this:

size = 0
for doc in dicts:
    size += doc.__sizeof__()
print size

This gives me 502920, which is about 500 KB, far less than 16 MB. So why do I get this error? I know I am measuring Python dictionaries rather than BSON documents, and MongoDB takes BSON documents, but that can't turn 500 KB into more than 16 MB. Moreover, I don't know how to convert a Python dict into a BSON document.

My MongoDB version is 2.0.6 and pymongo version is 2.2.1

EDIT: A bulk insert with 150 documents works fine, but with more than 150 documents this error appears.

Upvotes: 3

Views: 3244

Answers (3)

Carst

Reputation: 1614

Just had the same error and got around it by creating my own small bulks like this:

region_list = []
region_counter = 0
write_buffer = 1000
# loop through regions
for region in source_db.region.find({}, region_column):
    region_counter += 1 # up _counter
    region_list.append(region)
    # save bulk if we're at the write buffer
    if region_counter == write_buffer:
        result = user_db.region.insert(region_list)
        region_list = []
        region_counter = 0
# if there is a rest, also save that
if region_counter > 0:
    result = user_db.region.insert(region_list)

Hope this helps

NB: a small update. From pymongo 2.6 on, PyMongo auto-splits lists based on the max transfer size: "The insert() method automatically splits large batches of documents into multiple insert messages based on max_message_size"
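
To illustrate what splitting by transfer size (rather than by document count) could look like, here is a minimal sketch in plain Python. It is not PyMongo's implementation: `split_by_size`, `max_bytes`, and `size_of` are hypothetical names, and byte strings stand in for already-encoded documents.

```python
def split_by_size(docs, max_bytes, size_of=len):
    """Yield lists of docs whose combined size stays within max_bytes.

    size_of is a caller-supplied function returning the size of one doc.
    A single doc larger than max_bytes still ends up in a batch of its own.
    """
    batch, batch_bytes = [], 0
    for doc in docs:
        doc_bytes = size_of(doc)
        # Flush the current batch before this doc would push it over the limit.
        if batch and batch_bytes + doc_bytes > max_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    # Emit whatever is left over.
    if batch:
        yield batch


# Stand-in payloads (assumption): byte strings play the role of encoded documents.
encoded_docs = [b"x" * 300, b"y" * 300, b"z" * 300, b"w" * 100]
batches = list(split_by_size(encoded_docs, max_bytes=700))
batch_lengths = [len(b) for b in batches]
```

Each yielded batch could then be passed to a separate insert() call, which keeps every message under whatever limit you pick for `max_bytes`.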

Upvotes: 0

earthmeLon

Reputation: 638

This Bulk Inserts bug has been resolved, but you may need to update your pymongo version:

pip install --upgrade pymongo

Upvotes: 1

lovesh

Reputation: 5411

The error occurs because the bulk-inserted documents have an overall size greater than 16 MB.

My method of calculating the size of the dictionaries was wrong: __sizeof__() reports only the size of the dict object itself, not of the values it refers to.

When I manually inspected each key of the dictionary, I found that one key had a value of about 300 KB. That made the overall size of the documents in the bulk insert more than 16 MB (500 × 300+ KB > 16 MB). But I still don't know how to calculate the size of a dictionary without manually inspecting it. Can someone please suggest a way?
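
One way to avoid inspecting keys by hand is to walk the dictionary recursively and add up the sizes of everything it contains. The sketch below only approximates the in-memory size, which is not the same as the BSON size MongoDB sees; `deep_sizeof` is a hypothetical helper and the sample document is made up. If PyMongo's bundled bson package is available, encoding the dict (for example with bson.BSON.encode) and taking len() of the result should give the encoded size directly, though that is worth checking against the installed version.

```python
import sys


def deep_sizeof(obj, seen=None):
    """Approximate in-memory size of obj plus everything it references."""
    if seen is None:
        seen = set()
    # Count each object once, even if it is referenced from several places.
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_sizeof(key, seen) + deep_sizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_sizeof(item, seen)
    return size


# Made-up document with one large value, similar to the 300 KB key above.
doc = {"name": "region-1", "payload": "x" * 300000}
shallow = sys.getsizeof(doc)   # size of the dict object only
deep = deep_sizeof(doc)        # includes keys and values
```

Here `shallow` stays small no matter how large the payload is, while `deep` grows with it, which is why summing __sizeof__() over the dicts gave a misleadingly low total.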

Upvotes: 0
