thenubus
thenubus

Reputation: 169

Mongodb query performance extremly bad with $or

I use MongoDB 3.2.13 and I have a collection C with about 500K documents. The documents contain a String field A which can only a handful of different values. If I query the collection for specific values like

db.getCollection('C').count({'A':{'$eq': 'valueA'}}) 
db.getCollection('C').count({'A':{'$eq': 'valueB'}}) 

I get results in under 1 second. If i do a combined search using $or

 db.getCollection('C').count({'$or':
       [
       {'A':{'$eq': 'valueA'}},
        {'A':{'$eq': 'valueB'}}
       ]
 }) 

the query runs about 165 seconds. I have an index an A. I can't explain why the $or query is so much slower? This is just a simple example that could be solved by just adding the results of the first two queries but we have other queries that can not easily be split up.

Can anyone explain what is wrong the $or query? Or give me a hint how to speed it up?

db.getCollection('C').find(query).explain() gives:

 {
     "queryPlanner" : {
         "plannerVersion" : 1,
         "namespace" : "db.C",
         "indexFilterSet" : false,
         "parsedQuery" : {
             "$or" : [ 
                 {
                     "A" : {
                         "$eq" : "valueA"
                     }
                 }, 
                 {
                     "A" : {
                         "$eq" : "valueB"
                     }
                 }
             ]
         },
         "winningPlan" : {
             "stage" : "SUBPLAN",
             "inputStage" : {
                 "stage" : "FETCH",
                 "inputStage" : {
                     "stage" : "IXSCAN",
                     "keyPattern" : {
                         "A" : 1
                     },
                     "indexName" : "A",
                     "isMultiKey" : false,
                     "isUnique" : false,
                     "isSparse" : true,
                     "isPartial" : false,
                     "indexVersion" : 1,
                     "direction" : "forward",
                     "indexBounds" : {
                         "A" : [ 
                             "[\"valueA\", \"valueA\"]", 
                             "[\"valueB\", \"valueB\"]"
                         ]
                     }
                 }
             }
         },
         "rejectedPlans" : []
     },
     "executionStats" : {
         "executionSuccess" : true,
         "nReturned" : 2513596,
         "executionTimeMillis" : 133764,
         "totalKeysExamined" : 2513597,
         "totalDocsExamined" : 2513596,
         "executionStages" : {
             "stage" : "SUBPLAN",
             "nReturned" : 2513596,
             "executionTimeMillisEstimate" : 131660,
             "works" : 2513597,
             "advanced" : 2513596,
             "needTime" : 0,
             "needYield" : 0,
             "saveState" : 20912,
             "restoreState" : 20912,
             "isEOF" : 1,
             "invalidates" : 0,
             "inputStage" : {
                 "stage" : "FETCH",
                 "nReturned" : 2513596,
                 "executionTimeMillisEstimate" : 131490,
                 "works" : 2513597,
                 "advanced" : 2513596,
                 "needTime" : 0,
                 "needYield" : 0,
                 "saveState" : 20912,
                 "restoreState" : 20912,
                 "isEOF" : 1,
                 "invalidates" : 0,
                 "docsExamined" : 2513596,
                 "alreadyHasObj" : 0,
                 "inputStage" : {
                     "stage" : "IXSCAN",
                     "nReturned" : 2513596,
                     "executionTimeMillisEstimate" : 4420,
                     "works" : 2513597,
                     "advanced" : 2513596,
                     "needTime" : 0,
                     "needYield" : 0,
                     "saveState" : 20912,
                     "restoreState" : 20912,
                     "isEOF" : 1,
                     "invalidates" : 0,
                     "keyPattern" : {
                         "A" : 1
                     },
                     "indexName" : "A_1",
                     "isMultiKey" : false,
                     "isUnique" : false,
                     "isSparse" : true,
                     "isPartial" : false,
                     "indexVersion" : 1,
                     "direction" : "forward",
                     "indexBounds" : {
                         "A" : [ 
                             "[\"valueA\", \"valueA\"]", 
                             "[\"valueB\", \"valueB\"]"
                         ]
                     },
                     "keysExamined" : 2513597,
                     "dupsTested" : 0,
                     "dupsDropped" : 0,
                     "seenInvalidated" : 0
                 }
             }
         },
         "allPlansExecution" : []
     },
     "serverInfo" : {
         "host" : "xxxx",
         "port" : 27017,
         "version" : "3.2.13",
         "gitVersion" : "23899209cad60aaafe114f6aea6cb83025ff51bc"
     },
     "ok" : 1.0
 }

Upvotes: 2

Views: 887

Answers (2)

thenubus
thenubus

Reputation: 169

After searching and searching i found the reason. One of the values i was search for had null values. Null is not covered by the index so every document was read from disk to check if is ha null or not.

It's also mentioned in the Mongo University Workshops. I can strongly recommend taking the classes. They helped me a lot!

Upvotes: 0

mgyongyosi
mgyongyosi

Reputation: 2677

Use $in instead of $or (Mongo $or vs $in docs):

db.getCollection('C').count({'A':{'$in': ['valueA', 'valueB']}}) 

UPDATE

Or try with a find(..).count() like this:

db.getCollection('C').find({'A':{'$in': ['valueA', 'valueB']}}).count()

Upvotes: 1

Related Questions