Glory to Russia

Reputation: 18710

How to copy a lot of entries in MongoDB with slight modifications (Java driver)?

Imagine I have a MongoDB collection with the following fields:

  1. buildingID (String)
  2. projectID (String)
  3. coords (array of longitude/latitude coordinates)

I have lots of records that are assigned, via the projectID property, to project A. Now I want to

  1. take all records belonging to project A,
  2. duplicate them such that
  3. in the new records, all fields except projectID are equal to the original ones and
  4. projectID is equal to project B.

I could do it like this:

    DBCollection coll = getDb().getCollection("MyColl");

    final Map<String,Object> query = new HashMap<>();
    query.put("projectid", "projectA");

    DBCursor cursor = coll.find(new BasicDBObject(query));

    while (cursor.hasNext()) {
        // cursor.next() returns a DBObject, so a cast is needed here
        final BasicDBObject curRecord = (BasicDBObject) cursor.next();

        final BasicDBObject newRecord = clone(curRecord);
        newRecord.put("projectid", "projectB");
        coll.insert(newRecord);
    }

Here, clone is a helper that creates a copy of curRecord.

Is there a more elegant way to do this? Can I avoid getting data out of MongoDB into Java and back into MongoDB?

Upvotes: 0

Views: 68

Answers (1)

Blakes Seven

Reputation: 50406

There sure is a more elegant way to do it. Use the Bulk Operations API, as this will considerably reduce the number of write requests sent to the server and responses received from it:

    BulkWriteOperation bulk = coll.initializeOrderedBulkOperation();
    int count = 0;

    DBCursor cursor = coll.find(new BasicDBObject("projectid", "projectA"));

    while (cursor.hasNext()) {
        DBObject curRecord = cursor.next();
        curRecord.removeField("_id");  // why bother with a clone when you can remove the _id
        curRecord.put("projectid","projectB"); // replace the projectid
        bulk.insert(curRecord);
        count++;

        // send the queued inserts every 1000 records and start a new batch
        if ( count % 1000 == 0 ) {
            bulk.execute();
            bulk = coll.initializeOrderedBulkOperation();
        }
    }

    // send whatever is left over from the last, incomplete batch
    if ( count % 1000 != 0 ) {
        bulk.execute();
    }

Now things are only sent to and received from the server once every 1000 operations. 1000 is also an internal limit, but managing the batch size yourself helps to limit memory consumption.
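
To see how the "flush every 1000, then flush the remainder" logic behaves without a running server, here is a minimal sketch in plain Java. It does not use the MongoDB driver at all: `FakeBulkSink`, `copyToProject` and `BatchCopySketch` are hypothetical names made up for this illustration, and records are modelled as plain maps. Unlike the driver code above, it copies each record instead of modifying it in place.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchCopySketch {

    /** Hypothetical stand-in for the bulk writer: remembers the size of every flushed batch. */
    static final class FakeBulkSink {
        final List<Integer> flushedBatchSizes = new ArrayList<>();
        final List<Map<String, Object>> stored = new ArrayList<>();

        void execute(List<Map<String, Object>> batch) {
            flushedBatchSizes.add(batch.size());
            stored.addAll(batch);
        }
    }

    /** Copies every record, drops its _id, sets the new project id and flushes in batches. */
    static void copyToProject(List<Map<String, Object>> source, String newProjectId,
                              int batchSize, FakeBulkSink sink) {
        List<Map<String, Object>> batch = new ArrayList<>();
        for (Map<String, Object> record : source) {
            Map<String, Object> copy = new HashMap<>(record); // shallow copy of the fields
            copy.remove("_id");                               // let the store assign a new id
            copy.put("projectid", newProjectId);
            batch.add(copy);

            if (batch.size() == batchSize) {                  // full batch: send it and start over
                sink.execute(batch);
                batch = new ArrayList<>();
            }
        }
        if (!batch.isEmpty()) {                               // leftover, incomplete batch
            sink.execute(batch);
        }
    }

    public static void main(String[] args) {
        List<Map<String, Object>> source = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            Map<String, Object> record = new HashMap<>();
            record.put("_id", i);
            record.put("buildingID", "b" + i);
            record.put("projectid", "projectA");
            source.add(record);
        }

        FakeBulkSink sink = new FakeBulkSink();
        copyToProject(source, "projectB", 1000, sink);
        System.out.println(sink.flushedBatchSizes); // prints [1000, 1000, 500]
    }
}
```

With 2500 source records this results in three flushes instead of 2500 single inserts, and an exact multiple of the batch size does not trigger an extra, empty flush at the end.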

Upvotes: 1
