Reputation: 10907
I noticed a strange behavior. It might be mongoid or mongodb, I am not sure, but Counting documents is slower than fetching the documents. Here are the queries I fired:
Institution.all.any_of(:portaled_at.ne => nil).any_of(portaled: true).order_by(:portaled_at.desc).count
# mongodb query and timing as per mongoid logs,
# times are consistent over multiple runs
# MONGODB (236ms) db['$cmd'].find({"count"=>"institutions", "query"=>{"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}, "fields"=>nil}).limit(-1)
# MONGODB (245ms) db['$cmd'].find({"count"=>"institutions", "query"=>{"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}, "fields"=>nil}).limit(-1)
Institution.all.any_of(:portaled_at.ne => nil).any_of(portaled: true).order_by(:portaled_at.desc).to_a
# mongodb query and timing as per mongoid logs
# times are not so consistent over multiple runs,
# but consistently much lower than count query
# MONGODB (9ms) db['institutions'].find({"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}).sort([[:portaled_at, :desc]])
# MONGODB (18ms) db['institutions'].find({"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}).sort([[:portaled_at, :desc]])
I believe indexes are not used by mongodb for $and
and $or
queries, but just so if it matters, I have a sparse index on portaled_at
in descending order. Out of around 200,000 documents only around 50-60 have portaled_at set.
This is against my common sense and if anybody can explain what is going on I would really appreciate it.
Upvotes: 1
Views: 1315
Reputation: 9857
For some reason Mongo defaults to not counting records using only indexes. However, if you construct a query correctly, Mongo will count from the index. The trick is to only fetch the fields that are in the index, and you have to specify a query.
In Mongo Shell:
db.MyCollection.find({"_id":{$ne:''}},{"_id":1}).count()
You can check with the explain method:
db.MyCollection.find({"_id":{$ne:''}},{"_id":1}).explain()
Which will include "indexOnly" : true
in the output.
And similarly the command can be executed via the Moped driver directly like so:
Mongoid::Sessions.default.command(:count => "MyCollection", :query=>{"_id"=>{"$ne"=>""}}, :fields => {:_id=>1})
Which, in my benchmarks (on my live data, YMMV) is about 100x faster than simply doing MyMongoidDocumentClass.count
Unfortunately, there doesn't seem to be a way to do this quickly through the Mongoid gem.
Upvotes: 0
Reputation: 59763
While the two are running through different subsystems in MongoDB (one is using runCommand and the other the standard query engine), the specific issue in this case is very likely a known issue in the current version of MongoDb.
The quick summary is that counting without fetching is extremely slow as MongoDb is doing a lot of extra work that often isn't necessary. It's been fixed in the development branch, so it should be in 2.4 when it is released.
Upvotes: 4