bork

Reputation: 1674

Why does pymongo generate a duplicate _id value when using insert_one for different objects?

I'm trying to save objects to mongodb using pymongo. I have no issues with the first object, but when trying to save a second object I get pymongo.errors.DuplicateKeyError: E11000 duplicate key error collection: km_tracker.entries index: _id_ dup key: { : ObjectId('5b8ce80ebb822e06c8ecf1c7') }

My save function:

def save_entries(entries):
    entries['save_date'] = str(datetime.datetime.now())
    db.entries.insert_one(entries)

The traceback:

Traceback (most recent call last):
  File "app.py", line 182, in <module>
    main();
  File "app.py", line 22, in main
    new_entry()
  File "app.py", line 77, in new_entry
    review_information(entries)
  File "app.py", line 178, in review_information
    save_entries(entries)
  File "app.py", line 93, in save_entries
    db.entries.insert_one(entries)
  File "C:\Users\someuser\AppData\Local\Programs\Python\Python37\lib\site-packages\pymongo\collection.py", line 693, in insert_one
    session=session),
  File "C:\Users\someuser\AppData\Local\Programs\Python\Python37\lib\site-packages\pymongo\collection.py", line 607, in _insert
    bypass_doc_val, session)
  File "C:\Users\someuser\AppData\Local\Programs\Python\Python37\lib\site-packages\pymongo\collection.py", line 595, in _insert_one
    acknowledged, _insert_command, session)
  File "C:\Users\someuser\AppData\Local\Programs\Python\Python37\lib\site-packages\pymongo\mongo_client.py", line 1243, in _retryable_write
    return self._retry_with_session(retryable, func, s, None)
  File "C:\Users\someuser\AppData\Local\Programs\Python\Python37\lib\site-packages\pymongo\mongo_client.py", line 1196, in _retry_with_session
    return func(session, sock_info, retryable)
  File "C:\Users\someuser\AppData\Local\Programs\Python\Python37\lib\site-packages\pymongo\collection.py", line 592, in _insert_command
    _check_write_command_response(result)
  File "C:\Users\someuser\AppData\Local\Programs\Python\Python37\lib\site-packages\pymongo\helpers.py", line 217, in _check_write_command_response
    _raise_last_write_error(write_errors)
  File "C:\Users\someuser\AppData\Local\Programs\Python\Python37\lib\site-packages\pymongo\helpers.py", line 198, in _raise_last_write_error
    raise DuplicateKeyError(error.get("errmsg"), 11000, error)
pymongo.errors.DuplicateKeyError: E11000 duplicate key error collection: km_tracker.entries index: _id_ dup key: { : ObjectId('5b8ce6adbb822e40d431d444') }

The first object, which is successfully saved to the database:

{
"_id" : ObjectId("5b8ce6adbb822e40d431d444"),
"reg_number" : "dfg",
"date" : "dfg",
"b_meter_indication" : "dfg",
"end_meter_indication" : "dfg",
"trip" : "dfg",
"start_address" : "dfg",
"stop_address" : "dfg",
"reason" : "dfg",
"driver" : "dfg",
"other" : "dfg",
"save_date" : "2018-09-03 09:45:49.340871"
}

The second object, which is not saved due to the duplicate key:

{'_id': ObjectId('5b8ce6adbb822e40d431d444'),
 'b_meter_indication': 'rty',
 'date': 'rty',
 'driver': 'rty',
 'end_meter_indication': 'rty',
 'other': 'rty',
 'reason': 'rty',
 'reg_number': 'try',
 'save_date': '2018-09-03 09:46:02.246101',
 'start_address': 'rty',
 'stop_address': 'rty',
 'trip': 'rty'
}

As I'm not explicitly defining the value of _id, but letting pymongo do this for me, I don't understand why it would assign the previous value of _id to my current object. Could it be that pymongo, for some reason, thinks the second object is the same as the first one and therefore gives it the same _id value?

Python version: 3.7.0, MongoDB version: 4.0, PyMongo version: 3.7.1

Edit: Added the functions that call save_entries()

def edit_entry(entries):
    print("Editing: {}".format(entries))
    entry = input()
    return entry

def review_information(entries):
    print("Do you wish to edit something? (y/n)")
    while True:
        edit = input()

        if edit == "Y" or edit == "y":
            edit_entry(entries)
        elif edit == "N" or edit == "n":
            break
        else:
            print("Please provide a valid input")
            continue
    save_entries(entries)

Upvotes: 3

Views: 3736

Answers (2)

Fanco

Reputation: 54

I had the same problem and have fixed it now.

In my case, I created a dict once per iteration of the for loop and reused it inside the while loop. Whenever the date condition matched, I filled the dict and called insert_one. Even though the contents were completely different each time, it was still the same dict object, and it already carried the "_id" that PyMongo added on the first insert, so no new "_id" was generated. The fix was to move the dict creation from the loop level into the if branch:

for item in items:
    i = 1
    while i < len(item['a']):
        if item['a'][i-1]['date'] < item['a'][i]['date']:
            temp = {}     # create a fresh dict here, inside the if
            ...
            db.collection.insert_one(temp)
        i += 1

Upvotes: 1

Wan B.

Reputation: 18845

I'm not explicitly defining the value of _id, but letting pymongo do this for me

This is because when a document is inserted into MongoDB using insert_one(), insert_many(), or bulk_write(), and that document does not include an _id field, PyMongo automatically adds one for you, set to an instance of ObjectId.

See also FAQ: Why does PyMongo add an _id field to all of my documents?

I have no issues with the first object, but when trying to save a second object I get pymongo.errors.DuplicateKeyError

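Python does not have explicit pass-by-value or pass-by-reference semantics. Arguments are passed by assignment: the function receives a reference to the same object the caller holds. Lists and dicts are mutable, so a change made to them inside a function is visible to the caller, whereas numbers, strings, and tuples are immutable.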
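
As a minimal sketch of that difference (the function names add_id and rebind are made up for illustration and are not part of PyMongo), a dict modified inside a function changes for the caller, while rebinding a string does not:

```python
def add_id(doc):
    # Mutates the dict it receives; no copy is made on the call.
    doc["_id"] = 1


def rebind(text):
    # Rebinding the local name does not affect the caller's string.
    text = text + "!"


entries = {"driver": "dfg"}
add_id(entries)
print(entries)  # {'driver': 'dfg', '_id': 1}

name = "dfg"
rebind(name)
print(name)  # dfg
```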

In your case, you are essentially passing the same entries object to save_entries() each time. On the first call, PyMongo's insert_one() modifies that object by adding an _id field. When you then save it again, the object already contains that _id, so no new one is generated and the insert fails with the duplicate key error.

There are a few ways to handle this case:

  • Explicitly copy() the object before passing it to save_entries().
  • Delete the _id key from the object after each call to save_entries(), i.e. del entries["_id"].
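
Both options can be sketched without a running database. FakeCollection below is a hypothetical stand-in, not the real PyMongo API: it only mimics the relevant behaviour, adding an _id to the caller's dict when one is missing and rejecting a repeated _id. The integer ids are an assumption for illustration; PyMongo uses ObjectId.

```python
import itertools


class FakeCollection:
    """Hypothetical stand-in for a PyMongo collection (not the real API)."""

    def __init__(self):
        self._docs = {}
        self._ids = itertools.count(1)

    def insert_one(self, doc):
        # Like PyMongo, add an _id to the caller's dict only if it lacks one.
        if "_id" not in doc:
            doc["_id"] = next(self._ids)
        if doc["_id"] in self._docs:
            raise KeyError("duplicate _id: {}".format(doc["_id"]))
        self._docs[doc["_id"]] = dict(doc)


collection = FakeCollection()
entries = {"driver": "dfg"}

# Option 1: insert a copy, so the caller's dict never gains an _id.
collection.insert_one(entries.copy())
entries["driver"] = "rty"
collection.insert_one(entries.copy())

# Option 2: insert the dict itself, then drop the _id that was added.
entries["driver"] = "xyz"
collection.insert_one(entries)
del entries["_id"]

print(len(collection._docs))  # 3
print("_id" in entries)  # False
```

Note that dict.copy() is a shallow copy. That is enough for a flat document like the one in the question; a document with nested dicts or lists that are modified later would need copy.deepcopy() instead.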

Upvotes: 4
