Imran Omar Bukhsh

Reputation: 8071

Jumbled up ids in mongodb

We have around 20 million records in our MongoDB database. In my collection called 'posts' there is a field called 'id' which was supposed to be unique, but it has gotten messed up and now contains many duplicates. We just want it to be unique again.

We want to do something like iterating over every record and assigning it a unique id in a loop from 1 to 20 million.

What would be the easiest way to do this?

Upvotes: 0

Views: 159

Answers (2)

Sergio Tulentsev

Reputation: 230346

There are not many options here, really.

  1. Pick your language and driver of choice.

  2. Fetch N documents.

  3. Assign unique ids to them (several options here: copy _id, assign a new ObjectId, or assign a plain integer).

  4. Save those documents.

  5. Fetch next N documents. Go to step 3.

To fetch next N documents, you should note the last processed document's _id and do this:

db.collection.find({_id: {$gt: last_processed_id}}).sort({_id: 1}).limit(N);

Do not use skip here. It has to walk past every skipped document on each call, so it gets slower the deeper you go into the collection.
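The steps above can be sketched as a loop that remembers the last processed _id. This is only an illustration: the `sampleDocs` array and the `fetchBatch` helper are hypothetical in-memory stand-ins for the collection and for the `find(...).sort(...).limit(N)` query, so the snippet runs in plain Node without a database.

```javascript
// Hypothetical in-memory stand-in for the collection (assumption, not real data).
const sampleDocs = [5, 9, 12, 20, 31, 44, 58].map((n) => ({ _id: n, id: 7 }));

// Stand-in for: db.collection.find({_id: {$gt: lastId}}).sort({_id: 1}).limit(n)
function fetchBatch(lastId, n) {
  return sampleDocs
    .filter((d) => lastId === null || d._id > lastId)
    .sort((a, b) => a._id - b._id)
    .slice(0, n);
}

// Walk the collection in batches, assigning plain integer ids (option 3 above).
function renumber(batchSize) {
  let lastId = null; // _id of the last processed document
  let counter = 0;   // next integer id to hand out
  let batches = 0;

  for (;;) {
    const batch = fetchBatch(lastId, batchSize);
    if (batch.length === 0) break; // nothing left to process

    for (const doc of batch) {
      doc.id = ++counter; // with a real driver this would be an update call
    }

    lastId = batch[batch.length - 1]._id; // resume point for the next batch
    batches++;
  }
  return { counter, batches };
}

const renumberResult = renumber(3);
console.log(renumberResult);
```

Because each batch starts from the remembered _id rather than from an offset, the cost of fetching a batch does not grow with how far into the collection you are.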

And, of course, you can always truncate the collection, create a unique index on id, and populate it again.

Upvotes: 2

AD7six

Reputation: 66237

You can use a simple script like this:

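// Drop the existing unique index on id (if any) so renumbering cannot collide.
db.posts.dropIndex("*id index name here*");

var counter = 0;
var slice = 1000;                 // documents fetched per batch

var total = db.posts.count();
var conditions = {};

while (counter < total) {

    var cursor = db.posts.find(conditions, {_id: true}).sort({_id: 1}).limit(slice);
    var row = null;

    while (cursor.hasNext()) {
        row = cursor.next();
        db.posts.update({_id: row._id}, {$set: {id: ++counter}});
    }

    // Stop if the batch was empty (e.g. documents were deleted after the count).
    if (row === null) break;

    // The next batch resumes after the last _id processed.
    conditions['_id'] = {$gt: row._id};
    print("Processed " + counter + " rows");
}

print('Adding id index');
// Newer shells use createIndex in place of ensureIndex.
db.posts.ensureIndex({id: 1}, {unique: true, background: false});

print("done");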

Save it to assignids.js, and run it as:

$ mongo dbname assignids.js

The outer while loop selects 1000 rows at a time, which prevents cursor timeouts; the inner while loop assigns each row a new incremental id.
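As a possible variation, the per-row updates in the inner loop could be grouped into one bulk operation per batch. The sketch below only builds the operations array, so it runs without a database; the `buildBulkOps` helper and the sample batch are hypothetical names introduced for illustration, not part of the script above.

```javascript
// Hypothetical helper: turn one batch of rows into updateOne operations,
// numbering them consecutively after startCounter.
function buildBulkOps(batch, startCounter) {
  let counter = startCounter;
  return batch.map((row) => ({
    updateOne: {
      filter: { _id: row._id },
      update: { $set: { id: ++counter } },
    },
  }));
}

// Made-up batch standing in for the rows returned by one find() call.
const sampleBatch = [{ _id: "a" }, { _id: "b" }, { _id: "c" }];
const bulkOps = buildBulkOps(sampleBatch, 1000);

console.log(JSON.stringify(bulkOps, null, 2));
```

In a shell that supports it, the resulting array could be passed to `db.posts.bulkWrite(bulkOps, {ordered: false})` once per batch, with the counter advanced by the batch length afterwards. Whether this is noticeably faster than individual updates depends on the deployment, so treat it as something to measure rather than a guaranteed improvement.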

Upvotes: 1
