Taha Hussain
Taha Hussain

Reputation: 21

Mongo index question - same field repeated multiple times in a query

I have a general question on how mongo db indexes work

Lets say I have an index on fieldA, fieldB, fieldC

Is the query

{fieldA : "A", fieldB: {"$gt" : 5},fieldB: {"$lt" : 10} , "fieldC" : "C}

same as

{fieldA : "A", fieldB: {"$gt" : 5,"$lt" : 10} , "fieldC" : "C}

when it comes to query performance?

Upvotes: 0

Views: 34

Answers (1)

user20042973
user20042973

Reputation: 5065

The answer to your general question is "yes, they are the same when it comes to performance". But there's a big caveat here first when it comes to semantics and what query is actually communicated to the database from the driver.

To address the latter first, consider what happens when you paste the first query (plus the missing closing quotation mark) into the MongoDB shell (which runs javascript):

test> var query = {fieldA : "A", fieldB: {"$gt" : 5},fieldB: {"$lt" : 10} , "fieldC" : "C"}

test> query
{ fieldA: 'A', fieldB: { '$lt': 10 }, fieldC: 'C' }

Some programming languages (and therefore driver implementations) use data structures that do not support duplicate field names. In these situations the behavior you get is undefined and usually results in just one of the values getting captured as demonstrated above.

The documentation talks about that here and notes that such a query requires using the $and operator explicitly rather than relying on the implicit one. Therefore we can rewrite the first query as follows:

test> var query1 = {fieldA : "A", $and:[{fieldB: {"$gt" : 5}},{fieldB: {"$lt" : 10}}] , "fieldC" : "C"}

test> query1
{
  fieldA: 'A',
  '$and': [ { fieldB: { '$gt': 5 } }, { fieldB: { '$lt': 10 } } ],
  fieldC: 'C'
}

Now going back to the original main question, there should be no difference when it comes to performance. You can see this using @ray's suggestion to inspect explain output. Indeed the queryPlanner.parsedQuery should be the same between the two:

test> query1
{
  fieldA: 'A',
  '$and': [ { fieldB: { '$gt': 5 } }, { fieldB: { '$lt': 10 } } ],
  fieldC: 'C'
}
test> query2
{ fieldA: 'A', fieldB: { '$gt': 5, '$lt': 10 }, fieldC: 'C' }
test>

test> db.foo.find(query1).explain().queryPlanner.parsedQuery
{
  '$and': [
    { fieldA: { '$eq': 'A' } },
    { fieldC: { '$eq': 'C' } },
    { fieldB: { '$lt': 10 } },
    { fieldB: { '$gt': 5 } }
  ]
}
test> db.foo.find(query2).explain().queryPlanner.parsedQuery
{
  '$and': [
    { fieldA: { '$eq': 'A' } },
    { fieldC: { '$eq': 'C' } },
    { fieldB: { '$lt': 10 } },
    { fieldB: { '$gt': 5 } }
  ]
}
test>

test> JSON.stringify(db.foo.find(query1).explain().queryPlanner.parsedQuery) == JSON.stringify(db.foo.find(query2).explain().queryPlanner.parsedQuery)
true

Once the query has been parsed it will be planned the same way meaning that query performance should be the same.

In my personal opinion, the second query is much more readable so I would prefer that.

Upvotes: 2

Related Questions