ML_aj

Reputation: 11

MongoDB - Count does not match number of documents

I am trying to write a function that moves documents from collection_one to collection_two. I have run into an odd error where the counts don't add up.

collection_one.count({}) returns 3.3 million records.

After moving all the documents, collection_two.count({}) returns 3.2 million.

Each document in collection_one contains a unique uuid run_id. When I run the following commands, these are the outputs:

collection_one.count({'run_id': { $eq : 'uuid'}}) returns 3.2 million
collection_one.count({'run_id': { $ne : 'uuid'}}) returns 0

So about 0.1 million records are missing; they only show up in the count with an empty filter.
I have tried moving the documents in a couple of different ways via pymongo and using copyTo() in the shell.

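from pymongo.errors import PyMongoError

for doc in source.find():
    try:
        target.insert_one(doc)
    except PyMongoError as exc:
        # Report which document failed and why, instead of swallowing every error
        print('Did not copy', doc.get('_id'), exc)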

and a batch move function

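from pymongo import InsertOne
from pymongo.errors import BulkWriteError

for n in range(ceil_num_of_batches):
    # Sort on _id so that skip/limit pages are stable from one query to the next
    result = source.find(data_filter).sort('_id', 1).skip(n * batch_size).limit(batch_size)
    insert_queries = [InsertOne(doc) for doc in result]
    try:
        target.bulk_write(insert_queries)
    except BulkWriteError as bwe:
        logger.error(bwe.details)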

Both of these produce the same discrepancy. copyTo(), however, copies all 3.3 million documents, but it has been deprecated.
collection_two has a unique index, but collection_one doesn't.

Upvotes: 1

Views: 957

Answers (1)

Joe Drumgoole

Reputation: 1348

You need to use countDocuments (count_documents() in PyMongo) to get an accurate document count. count() uses internal collection metadata and may not always give an accurate result.
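
As a rough sketch of the difference in PyMongo (3.7 or later), the helper below returns both the metadata-based estimate and the exact count for a collection. The name compare_counts is made up for this illustration, and the collection argument is assumed to be a PyMongo Collection object.

```python
def compare_counts(collection, query=None):
    """Return (estimated, exact) document counts for a collection.

    `collection` is assumed to be a pymongo Collection (or any object
    exposing the same two methods). The helper name is hypothetical.
    """
    query = query or {}
    # Fast, read from collection metadata; takes no filter and may be inaccurate.
    estimated = collection.estimated_document_count()
    # Counts the documents that actually match the filter.
    exact = collection.count_documents(query)
    return estimated, exact
```

Calling compare_counts(collection_one) and compare_counts(collection_two) before and after the move would show whether the 0.1 million gap also appears in the exact counts or only in the metadata-based ones.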

Upvotes: 1
