Teena George

Reputation: 198

MongoDB Java driver: collection.count() always returns one extra

I have a MongoDB replica set of 2 nodes.

A typical document in the collection looks like this:

{
    _id: 409,
    status: "active",
    address: [
        { id: 1000012, type: "primary", status: "active" },
        { id: 1000011, type: "primary", status: "inactive" },
        { id: 1000010, type: "primary", status: "inactive" }
    ],
}

When I use the Java MongoDB Driver to find the count of a collection, based on some simple filters, I always get 1 extra (e.g., if the actual count is 1299 the result is 1300):

db.collection.count({
    "status": "active",
    "address.type": "primary",
    "address.status": "active"
});

I read in the official documentation that collection.count(...) can return incorrect results for sharded collections, but mine is not sharded; it is just a replica set.

However, when I aggregate the same query and print the sum, it is always correct (1299):

db.collection.aggregate([
    { $unwind: "$address" },
    { $match: {
        "status": "active",
        "address.type": "primary",
        "address.status": "active",
    }},
    { $group: { _id: null, count: { $sum: 1 }}},
    { $project: { _id: 0, count: 1 }}
]);

What could be the reason for this behaviour?

The following query, which uses $elemMatch, returns the same count as the aggregation:

db.collection.count({"address": {$elemMatch:{"status": "active", "type": "primary"}}, status: "active"});

Upvotes: 1

Views: 273

Answers (1)

Danziger

Reputation: 21161

The first query:

db.collection.count({
    "status": "active",
    "address.type": "primary",
    "address.status": "active"
});

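This query does not compute the same thing as the aggregation. It selects whole documents (not subdocuments) that have status = "active", at least one address with type = "primary", and at least one address with status = "active". The two address conditions do not have to be satisfied by the same array element.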

Judging by your question and comments, I assume 1300 documents match that filter, but in at least one of them the address.type and address.status conditions are met by different subdocuments. The aggregation returns a different result because, after $unwind, both conditions must hold for the same subdocument.
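To illustrate the difference, here is a minimal sketch in plain JavaScript (no MongoDB connection involved) that emulates the three matching semantics on made-up sample data. The sample documents and the function names countDotNotation, countElemMatch, and countUnwound are hypothetical and exist only for this illustration; they are not part of any MongoDB API.

```javascript
// Hypothetical sample data, shaped like the document in the question.
const docs = [
  // One address element satisfies both conditions.
  { _id: 1, status: "active", address: [
      { id: 11, type: "primary", status: "active" },
      { id: 12, type: "primary", status: "inactive" } ] },
  // The type and status conditions are met by DIFFERENT elements.
  { _id: 2, status: "active", address: [
      { id: 21, type: "primary", status: "inactive" },
      { id: 22, type: "secondary", status: "active" } ] },
  // Top-level status does not match.
  { _id: 3, status: "inactive", address: [
      { id: 31, type: "primary", status: "active" } ] },
];

// Emulates { status: "active", "address.type": "primary", "address.status": "active" }:
// each dotted condition may be satisfied by a different array element.
function countDotNotation(docs) {
  return docs.filter(d =>
    d.status === "active" &&
    d.address.some(a => a.type === "primary") &&
    d.address.some(a => a.status === "active")
  ).length;
}

// Emulates { status: "active", address: { $elemMatch: { type: "primary", status: "active" } } }:
// a single array element must satisfy both conditions.
function countElemMatch(docs) {
  return docs.filter(d =>
    d.status === "active" &&
    d.address.some(a => a.type === "primary" && a.status === "active")
  ).length;
}

// Emulates $unwind followed by $match and a counting $group:
// counts matching (document, address element) pairs, not documents.
function countUnwound(docs) {
  return docs
    .flatMap(d => d.address.map(a => ({ status: d.status, address: a })))
    .filter(r =>
      r.status === "active" &&
      r.address.type === "primary" &&
      r.address.status === "active")
    .length;
}

console.log(countDotNotation(docs)); // 2
console.log(countElemMatch(docs));   // 1
console.log(countUnwound(docs));     // 1
```

Document 2 is the kind of document that would account for the extra result: it passes the dot-notation filter but not the per-element one. Note also that the $unwind version counts matching address elements, so it equals the document count only when no document has more than one matching address.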

WiredTiger issues after hard crash:

For reference, another, less frequent cause of wrong counts is a hard crash when using WiredTiger as the storage engine. The crash can leave the collection statistics (as reported by db.stats) in an inconsistent state after recovery, and they are not recalculated automatically on startup, even if the data itself was recovered successfully. To rebuild them, run db.collection.validate(true).

For more on this issue, see:

Upvotes: 1
