Reputation: 4515
I encountered weird thing when using C# MongoDB CountDocumentAsync
function. I enabled query logging on MongoDB and this is what I got:
{
"op" : "command",
"ns" : "somenamespace",
"command" : {
"aggregate" : "reservations",
"pipeline" : [
{
"some_query_key": "query_value"
},
{
"$group" : {
"_id" : null,
"n" : {
"$sum" : 1
}
}
}
],
"cursor" : {}
},
"keyUpdates" : 0,
"writeConflicts" : 0,
"numYield" : 9,
"locks" : {
"Global" : {
"acquireCount" : {
"r" : NumberLong(24)
}
},
"Database" : {
"acquireCount" : {
"r" : NumberLong(12)
}
},
"Collection" : {
"acquireCount" : {
"r" : NumberLong(12)
}
}
},
"responseLength" : 138,
"protocol" : "op_query",
"millis" : 2,
"execStats" : {},
"ts" : ISODate("2018-09-27T14:08:48.099Z"),
"client" : "172.17.0.1",
"allUsers" : [ ],
"user" : ""
}
simple count is converted into an aggregate.
More interestingly when I use CountAsync
function (which btw is marked obsolete with remark I should be using CountDocumentsAsync
) it produces:
{
"op" : "command",
"ns" : "somenamespace",
"command" : {
"count" : "reservations",
"query" : {
"query_key": "query_value"
}
},
"keyUpdates" : 0,
"writeConflicts" : 0,
"numYield" : 9,
"locks" : {
"Global" : {
"acquireCount" : {
"r" : NumberLong(20)
}
},
"Database" : {
"acquireCount" : {
"r" : NumberLong(10)
}
},
"Collection" : {
"acquireCount" : {
"r" : NumberLong(10)
}
}
},
"responseLength" : 62,
"protocol" : "op_query",
"millis" : 2,
"execStats" : {
},
"ts" : ISODate("2018-09-27T13:58:27.758Z"),
"client" : "172.17.0.1",
"allUsers" : [ ],
"user" : ""
}
which is what I would expect. Does anyone know what might be a reason for this behavior? I browsed documentation but didn't find anything interesting regarding it.
Upvotes: 5
Views: 1773
Reputation: 563
This is the documented behaviour for drivers supporting 4.0 features. The reason for the change is to remove confusion and make it clear when an estimate is used and when it is not.
When counting based on a query filter (rather than just counting the entire collection) both methods will cause the server to iterate over matching documents to count them and therefore have similar performance.
From MongoDb docs: db.collection.count()
NOTE:
MongoDB drivers compatible with the 4.0 features deprecate their respective cursor and collection count() APIs in favor of new APIs for countDocuments() and estimatedDocumentCount(). For the specific API names for a given driver, see the driver documentation.
From MongoDb docs: db.collection.countDocuments()
db.collection.countDocuments(query, options)
New in version 4.0.3.
Returns the count of documents that match the query for a collection or view. The method wraps the $group aggregation stage with a $sum expression to perform the count and is available for use in Transactions.
A more detailed explanation for this change in API can be found on the MongoDb JIRA site:
Drivers supporting MongoDB 4.0 must deprecate the count() helper and add two new helpers - estimatedDocumentCount() and countDocuments(). Both helpers are supported with MongoDB 2.6+.
The names of the new helpers were chosen to make it clear how they behave and exactly what they do. The estimatedDocumentCount helper returns an estimate of the count of documents in the collection using collection metadata, rather than counting the documents or consulting an index. The countDocuments helper counts the documents that match the provided query filter using an aggregation pipeline.
The count() helper is deprecated. It has always been implemented using the count command. The behavior of the count command differs depending on the options passed to it and the topology in use and may or may not provide an accurate count. When no query filter is provided the count command provides an estimate using collection metadata. Even when provided with a query filter the count command can return inaccurate results with a sharded cluster if orphaned documents exist or if a chunk migration is in progress. The countDocuments helper avoids these sharded cluster problems entirely when used with MongoDB 3.6+, and when using
Primary
read preference with older sharded clusters.
Upvotes: 5