Why does MongoDB db.col.count() show more documents than were inserted?

Question

Using the Java driver for MongoDB, I was trying to insert 25,637,015 documents into a MongoDB cluster. The documents were retrieved from a SQL Server database and inserted into an initially empty sharded MongoDB collection (called col) in a multithreaded fashion (8 concurrent threads). The process took 2 hours. What is interesting and puzzling is that something went on for over 6(!) hours AFTER the program had finished.

Firstly, the hard drives in my cluster nodes continued to spin like crazy. Secondly, and more importantly, db.col.count(), which I ran at intervals of less than a second, kept returning different results:

mongos> db.col.count() 
25694898
mongos> db.col.count()
25694917
mongos> db.col.count()
25695154
mongos> db.col.count()
25695207
mongos> db.col.count()
25695422
mongos> db.col.count()
25695493
mongos> db.col.count()
25696024
mongos> db.col.count()
25696130
mongos> db.col.count()
25698565
mongos> db.col.count()
25695145

What is even more intriguing is that all these counts, while going up and down, were greater than the number of inserted documents: 25,637,015. Had they been smaller, I could speculate that the documents went to some sort of queue and were being slowly processed, but greater?!
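To put numbers on the discrepancy, here is a minimal Java sketch that computes how far each sampled count sits above the inserted total. The class and method names (CountSurplus, surplus, minSurplus, maxSurplus) are made up for illustration; the only real data are the inserted total and the ten values copied from the shell output above.

```java
import java.util.Arrays;

class CountSurplus {
    // Number of documents the Java program inserted (from the question).
    static final long INSERTED = 25_637_015L;

    // Values returned by db.col.count() in the shell session above.
    static final long[] SAMPLES = {
        25_694_898L, 25_694_917L, 25_695_154L, 25_695_207L, 25_695_422L,
        25_695_493L, 25_696_024L, 25_696_130L, 25_698_565L, 25_695_145L
    };

    // How many documents the reported count exceeds the inserted total by.
    static long surplus(long observed, long inserted) {
        return observed - inserted;
    }

    // Smallest surplus across all samples.
    static long minSurplus(long[] samples, long inserted) {
        return Arrays.stream(samples).map(c -> surplus(c, inserted)).min().getAsLong();
    }

    // Largest surplus across all samples.
    static long maxSurplus(long[] samples, long inserted) {
        return Arrays.stream(samples).map(c -> surplus(c, inserted)).max().getAsLong();
    }

    public static void main(String[] args) {
        System.out.println("min surplus: " + minSurplus(SAMPLES, INSERTED)); // 57883
        System.out.println("max surplus: " + maxSurplus(SAMPLES, INSERTED)); // 61550
    }
}
```

So in these samples the reported count was between 57,883 and 61,550 documents above what was actually inserted, roughly 0.2% too high.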

Like I said, after six hours it all stabilized: the hard drives stopped spinning and db.col.count() finally returned the correct number: 25637015.

In case it matters: I have 2 replica sets in my sharded cluster. Each replica set has 2 data nodes and 1 arbiter-only node. I run 3 config servers and 3 mongos instances, all spread across 4 CentOS boxes (virtual) running on Windows hosts. The source SQL Server is on yet another physical machine. The balancer was not disabled during the insert or at any time after. My MongoDB version is 2.2.6, 64-bit.

Any idea what MongoDB was doing for six hours after the Java program had finished inserting? Why was the count so high?

Thank you
