piglet

Reputation: 93

Data not appearing in MongoDB after bulk insert?

I'm currently working with MongoDB running inside Azure CosmosDB to serve as a data store for an API. For my use case, I process a 30MB .csv file with approx. 180000 rows once per day. Because this project is still in the proof-of-concept stage, my CosmosDB instance is limited to a maximum of 4000 RU/s. I also set the shard key to be the date corresponding to the insert.

I split the data into batches of 100 documents each and send one batch per second. This process runs overnight, so a long insert time is not a concern. It takes roughly an hour to insert all the documents. I performed a dry run today and monitored the inserts through MongoDB Compass. While the process was running, I could see more and more documents being added to the collection each time I refreshed, but once the process finished, both Compass and the CosmosDB Explorer reported that there were no documents whatsoever in the collection. My question is, what could have happened here? As far as I know, I didn't experience rate limiting of any kind, and I have still been charged for the insert operations.

I process and load the batches with Python and pymongo. I have except blocks set up to catch BulkWriteErrors, but no errors were reported by the script. The code for loading is as follows:

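import time

import pymongo
from pymongo.errors import BulkWriteError

# records, collection and INTERVAL are defined earlier in the script.
for n in range(0, len(records), 100):
    time.sleep(INTERVAL)  # throttle: wait before sending the next batch
    batch_slice = records[n:n+100]
    batch_reqs = [
        pymongo.InsertOne(row)
        for row in batch_slice
    ]

    try:
        # ordered=False lets the remaining inserts proceed if one fails
        collection.bulk_write(batch_reqs, ordered=False)
    except BulkWriteError as e:
        ...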

Upvotes: 0

Views: 164

Answers (1)

piglet

Reputation: 93

For those who may inadvertently find this page, my issue was that I had set a time-to-live of 128 seconds on all my documents in the collection through Terraform, so they were being deleted 128 seconds after being inserted. Changing the value to -1 fixed the issue.

Kind of embarrassing I didn't spot this straight away, but it's one of those things that can be easily missed, I guess.
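
For anyone wanting to check for the same problem from Python rather than from the Terraform config, here is a minimal sketch. It assumes the TTL is exposed as an `expireAfterSeconds` option on one of the collection's indexes, which may not hold for every CosmosDB setup. The helper name `find_ttl_indexes` and the sample index names are hypothetical; the input is shaped like the dict that pymongo's `collection.index_information()` returns.

```python
from typing import Any, Dict, Mapping


def find_ttl_indexes(index_info: Mapping[str, Mapping[str, Any]]) -> Dict[str, int]:
    """Return {index_name: expireAfterSeconds} for every index that carries a TTL."""
    return {
        name: spec["expireAfterSeconds"]
        for name, spec in index_info.items()
        if "expireAfterSeconds" in spec
    }


# Hypothetical sample, shaped like the result of collection.index_information().
# Against a live collection you would pass collection.index_information() instead.
sample = {
    "_id_": {"v": 2, "key": [("_id", 1)]},
    "_ts_1": {"v": 2, "key": [("_ts", 1)], "expireAfterSeconds": 128},
}

ttl_indexes = find_ttl_indexes(sample)
print(ttl_indexes)
```

If this returns a non-empty dict with a small positive value, documents are probably being expired shortly after they are written, which would match what I saw.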

Upvotes: 1
