lovesh

Reputation: 5411

MongoDB unique indexes not working

I have a lot of documents that I index by URL. I created an index on url from pymongo like this:

coll.create_index('url', unique=True, background=True)      # coll is the name of the collection

But I am still able to insert documents with duplicate URLs.

I checked from the mongo shell whether the index actually exists, and it shows this:

{
        "v" : 1,
        "key" : {
            "url" : 1
        },
        "ns" : "dbname.coll",
        "name" : "url_1",
        "background" : true
},

Does setting background=True also mean that the uniqueness of url won't be checked at the moment a document is inserted? I am totally confused about why the uniqueness constraint is not working.

Upvotes: 1

Views: 4000

Answers (2)

Eren Güven

Reputation: 2374

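First off, db.collection.getIndexSpecs() should report "unique" : true for a unique index. The output in your question has no such field, so the index was not created as unique.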
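The same check can be made from pymongo on the dictionary returned by coll.index_information(). This is a minimal sketch, assuming that dictionary maps index names to spec dicts and that a unique index carries a "unique" entry. The helper name is_unique_index and the sample data (modeled on the output in the question) are illustrative only:

```python
def is_unique_index(index_info, name):
    """Return True if the named index is flagged unique.

    index_info is assumed to have the shape returned by pymongo's
    Collection.index_information(): {index_name: {"key": [...], ...}}.
    A non-unique index simply has no "unique" entry.
    """
    spec = index_info.get(name)
    if spec is None:
        raise KeyError("no index named %r" % name)
    return bool(spec.get("unique", False))


# Sample data modeled on the index shown in the question (no "unique" field).
question_indexes = {
    "url_1": {"v": 1, "key": [("url", 1)], "background": True},
}

# What the spec would look like if the unique flag had been applied.
expected_indexes = {
    "url_1": {"v": 1, "key": [("url", 1)], "unique": True, "background": True},
}

print(is_unique_index(question_indexes, "url_1"))  # False
print(is_unique_index(expected_indexes, "url_1"))  # True
```

With a live connection, the first argument would be coll.index_information() instead of a literal dictionary.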

As for inserting, my guess is that you are doing unsafe writes. An unsafe save that violates a unique index constraint still returns a new ObjectId, but no existing document is changed and no new document is saved.

You'll get a pymongo.errors.DuplicateKeyError if you try the operation with safe=True.

Upvotes: 3

stbrody

Reputation: 1856

Building a unique index with dropDups=True will fail if it encounters more than 1 million documents that need to be dropped because they have duplicate keys for the index. If you can manually delete enough duplicates to bring the total duplicate count below 1 million, you should be able to build the index successfully.
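To judge how far a collection is from that limit, it can help to count the duplicates first. The following is only a rough sketch, assuming the documents (or just their url values) can be iterated from Python and that one document per key is kept. The helper name count_duplicates and the sample documents are hypothetical:

```python
from collections import Counter


def count_duplicates(docs, field="url"):
    """Return (duplicated_values, docs_to_drop) for the given field.

    docs_to_drop counts every document beyond the first for each value,
    i.e. how many documents would have to go to make the field unique.
    """
    counts = Counter(doc[field] for doc in docs if field in doc)
    duplicated = {value: n for value, n in counts.items() if n > 1}
    docs_to_drop = sum(n - 1 for n in duplicated.values())
    return duplicated, docs_to_drop


# Hypothetical sample documents; in practice these would come from a query
# such as coll.find({}, {"url": 1}).
sample_docs = [
    {"url": "http://example.com/a"},
    {"url": "http://example.com/b"},
    {"url": "http://example.com/a"},
    {"url": "http://example.com/a"},
]

duplicated, docs_to_drop = count_duplicates(sample_docs)
print(duplicated)    # {'http://example.com/a': 3}
print(docs_to_drop)  # 2
```

This keeps every distinct value in memory, so for a collection large enough to approach the limit a server-side aggregation would likely be more practical.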

Another option, if you can't get the number of duplicates below 1 million, is to take a secondary offline and bring it up as a stand-alone node that is not part of the replica set. Then you can dump the collection that needs the new index using mongodump -d -c , drop the collection, create the index, and restore the data using mongorestore. That secondary will then have the same documents, but with the duplicates removed and the index built. Put that secondary back into the replica set and repeat the process on another secondary. Finally, step down your primary and do the same on that node once a new primary has been elected.

Upvotes: 0
