henry
henry

Reputation: 607

count() is slow operation in MongoDB, how to speed up count() specifically?

I have a collection called ParseRequest. It is Shaded with sharing key _id. Probably not the best choice but right now I do not think it matters as the Collection only has 40,000 documents. There are two properties on ParseRequest collection that I am concerned with in this case: processed (Boolean) and parsed (Boolean).

I need to run this query and I want it to be lightning fast:

db. ParseRequest.count({processed: true, parsed: true})

So I tried two different ways:

  1. Have separate composite key on processed and parsed
  2. Include processed and parsed into shard key

Both ways improve performance but not enough, the count() above runs in 2-3 seconds or so, but I need much faster than that.

What is noteworthy, this query returns in no time (few milliseconds):

db. ParseRequest.find({processed: true, parsed: true}).limit(5)

But

db. ParseRequest.count({processed: true, parsed: true})

is still slow in either setup.

Is there anything else I should try?

Departing from this specific example above, it looks like in general count() for a specific criteria is very expensive operation in MongoDB. Even if you have an index it is still slow to do count; way slower than getting first few rows for the same criteria. Is there any reason for that?

I am coming from SQL Server background: calculating count(*) was always a fast thing in SQL Server. Calculating count is important to my app, and my frustration with MongoDD has grown so much that I am considering dumping MongoDB for that reason alone: slow to calculate count() for a specific criteria. But before I do that I want to be sure that I exhausted all possible ways to improve count calculation. Any suggestion is appreciated. Thank you.

-=-=-

Edit after firts few comments:

I use v2.2.6 running on Centos (64 bit)

Yes, explain says the idex is used and by the way without index it is even slower.

Yes I undestand that to calculate count() for a specific criteria the entire index tree needs to be scanned, but please excuse my comarison to SQL Server, in SQL Server it is also a complete index and yet somehow count(*) is faster everything else being equal. So is there any trick I can use in MongoDB?

Upvotes: 3

Views: 2502

Answers (1)

John Petrone
John Petrone

Reputation: 27515

I would try upgrading to at least version 2.4.x. There was a performance fix for count() that was released in 2.3.2. https://jira.mongodb.org/browse/SERVER-1752

Upvotes: 3

Related Questions