simple liquids
simple liquids

Reputation: 165

Retrieve Multiple Documents by Specifying a Compound Key for Each

I have a collection of documents that has a compound key of {_id:{id:x, version:x}}:

{'_id':{'id':1, 'version':1}}
{'_id':{'id':1, 'version':2}}
{'_id':{'id':1, 'version':3}}
{'_id':{'id':2, 'version':1}}
{'_id':{'id':3, 'version':1}}
{'_id':{'id':4, 'version':1}}
{'_id':{'id':4, 'version':2}}
{'_id':{'id':4, 'version':3}}
{'_id':{'id':5, 'version':1}}

I want to write a query that retrieves multiple documents, in sql I would do something like

WHERE (id,version) IN (1,1),(1,2),(3,1),(4,1)

How can I do this in mongo?

Upvotes: 1

Views: 313

Answers (2)

simple liquids
simple liquids

Reputation: 165

So basically, what I've found is mongo does not have an efficient way to get results that have a compound index. $or is not performant and seems just crashes when you have a large amount of ors in the statment >1000. In the end, I've just decided to use an {'_id.id':{'$in':[1,2,3,4....etc.]}} which is very fast but returns extra documents, and then filter down after. For my case, this should work well enough even though its not ideal.

Another approach would be to use a different index key which combines the indexes I have into a 64 bit value, where I assign the first X bits to the id and the second x bits to the version. Doing this would allow me to quickly get the exact values I want. The main reason against this approach was I thought keeping similar documents on the same shard would be beneficial, as I can shard against part of the compound index that is the same. In the end, I think this second approach is the better approach. IE, don't use a compound index in mongo.

Upvotes: 0

Neil Lunn
Neil Lunn

Reputation: 151122

You would actually use an $or expression.

It depends on how your source data is stored. So for an array of "objects"

var params = [{ id: 1, version: 1 },{ id: 2, version: 1 },{ id: 3, version: 2 }];

db.collection.find({
  "$or": map.params( p =>
    Object.keys(p).map( k => ({ [`_id.${k}`]: p[k] }) )
      .reduce((acc,curr) => Object.assign(acc,curr), {})
  )
})

Or if you list have an "array of arrays" like in your SQL:

var params = [[1,1],[2,1],[3,2]];

db.collection.find({
  "$or": params.map(([id,version]) => ({ '_id.id': id, '_id.version': version }) )
})

Which should give you a bit more perspective on the sort of transformation happening here.

Basically it's about the $or and the "dot notation" to denote the compound keys. But I'm showing you a little transform here because with MongoDB you're typically working with "data structures", so you want to manipulate those instead of writing out a statement like SQL does.

Without the "mapping" that basically looks like:

db.collection.find({
  "$or": [
    { "_id.id": 1, "_id.version": 1 },
    { "_id.id": 2, "_id.version": 1 },
    { "_id.id": 3, "_id.version": 2 }
  ]
})

Which means that "any" of those listed conditions are going to possibly match a document ( or more than one ) and return them.

So the example is just the JavaScript way since that's what the provided "shell" allows. But you can essentially adapt the approach to your language of choice.


Of course you set this up in your MongoDB shell first with:

db.collection.insertMany([
  {'_id':{'id':1, 'version':1}},
  {'_id':{'id':1, 'version':2}},
  {'_id':{'id':1, 'version':3}},
  {'_id':{'id':2, 'version':1}},
  {'_id':{'id':3, 'version':1}},
  {'_id':{'id':4, 'version':1}},
  {'_id':{'id':4, 'version':2}},
  {'_id':{'id':4, 'version':3}},
  {'_id':{'id':5, 'version':1}}
])

Upvotes: 1

Related Questions