mongodb.countDocuments is slow when result set is large even if index is used

Question

mongodb.countDocuments is slow when result set is large

Test data on users collection:

10M documents with status 'active'
100k documents with status 'inactive'

The field status is indexed {status: 1}

db.users.countDocuments({status: 'active'}) takes 2.91 sec db.users.countDocuments({status: 'inactive'}) takes 0.018 sec

I understand that countDocuments uses an aggegation to find and count the results.

estimatedDocumentCount() does not work in this case because query filter is needed

Any suggestions for improvement?

klhr · Accepted Answer

Counting seems like one of those things that should be cheap, but often isn't. Because mongo doesn't maintain a count of the number of documents that match certain criteria in its b-tree index, it needs to scan through the index counting documents as it goes. That means that counting 100x the documents will take 100x the time, and this is roughly what we see here -- 0.018 * 100 = 1.8s.

To speed this up, you have a few options:

The active count is roughly estimatedDocumentCount() - db.users.countDocuments({status: 'inactive'}). Would this be accurate enough for your use case?
Alternatively, you can maintain a counts document in a separate collection that you keep in sync with the number of active/inactive documents that you have.

mongodb.countDocuments is slow when result set is large even if index is used

Answers (1)

Related Questions