mb925

Reputation: 137

Remove duplicated objects from MongoDB query results

I have an aggregation pipeline in MongoDB composed of a $match and a $project stage, and the result is an array of objects.

An example of result:

[{"pdb_id":"1avy"},{"pdb_id":"1avy"},{"pdb_id":"1lwu"}]

I would like to remove duplicated objects, so the result should be:

[{"pdb_id":"1avy"},{"pdb_id":"1lwu"}]

An example of working solution is:

 // Keep the first occurrence of each object, comparing documents by their serialized form.
 const uniqueArray = result.filter((object, index) => index === result.findIndex(obj => JSON.stringify(obj) === JSON.stringify(object)));

But this is extremely slow when more data is involved. Do you know of a faster solution?

Please consider that the objects in the result may also have more than one property. For example:

[{"pdb_id":"1avy", "pdb_chain":"A"},{"pdb_id":"1avy", "pdb_chain":"A"},{"pdb_id":"1lwu", "pdb_chain":"A"}]

This needs to be filtered to:

 [{"pdb_id":"1avy", "pdb_chain":"A"},{"pdb_id":"1lwu", "pdb_chain":"A"}]

Upvotes: 1

Views: 43

Answers (1)

Tom Slabbaert

Reputation: 22276

If you want to do it with Mongo's help you don't have much choice other than using $group:

db.collection.aggregate([
  {
    $project: {
      _id: 0
    }
  },
  {
    $group: {
      _id: "$$ROOT"
    }
  },
  {
    $replaceRoot: {
      newRoot: "$_id"
    }
  }
])

Mongo Playground

Obviously Mongo has some limits; assuming you don't hit the 100 MB memory limit of the $group stage, this should always be the faster option.
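
For completeness, a minimal sketch of running the same pipeline from Node.js with the official driver; the connection string, database and collection names are assumptions, and allowDiskUse lets the $group stage spill to disk if it outgrows the in-memory limit:

// Sketch only: connection details and names are placeholders.
const { MongoClient } = require("mongodb");

async function distinctDocs() {
  const client = await MongoClient.connect("mongodb://localhost:27017");
  try {
    return await client
      .db("mydb")
      .collection("collection")
      .aggregate(
        [
          { $project: { _id: 0 } },
          { $group: { _id: "$$ROOT" } },
          { $replaceRoot: { newRoot: "$_id" } }
        ],
        { allowDiskUse: true } // allow $group to exceed the 100 MB in-memory limit
      )
      .toArray();
  } finally {
    await client.close();
  }
}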

Upvotes: 1
