Harsh Kishore

Reputation: 11

How to bulk delete millions of documents spread across millions of logical partitions in the Cosmos DB SQL API?

The MS Azure documentation does not say anything about this. The official bulk executor documentation covers only the insert and update options, not delete. There is a suggested server-side JavaScript sample for creating a stored procedure, which sounds very good, but it requires us to pass in the partition key value. That won't make sense if our documents are spread across millions of logical partitions.

This is a very simple business need. While migrating a huge volume of data into a SQL API Cosmos collection, if we insert some wrong data, there seems to be no option to delete it other than restoring to a previous state. I have explored this for a few hours now but couldn't find a solution. I even raised a case with MS support; they directed me to some .NET code, which I still need to look at because it does not look straightforward. What if someone doesn't know .NET?

Can't we easily bulk delete documents spread across several logical partitions in the Cosmos DB SQL API? Feels disgusting.

I hope you can provide some accurate details on how to achieve this, with simple, straightforward sample code and steps. I hope MS and Cosmos DB experts will share their views as well.

Upvotes: 0

Views: 4491

Answers (3)

Jay Gong

Reputation: 23782

Even raised a case with MS support; they directed me to some .NET code, which I still need to look at because it does not look straightforward.

Obviously, you have already made an effort to find a solution. Apart from the two options below, I am not aware of any others:

  1. Bulk delete stored procedure: https://github.com/Azure/azure-cosmosdb-js-server/blob/master/samples/stored-procedures/bulkDelete.js

  2. Bulk delete executor:

    .NET: https://github.com/Azure/azure-cosmosdb-bulkexecutor-dotnet-getting-started/blob/master/BulkDeleteSample/BulkDeleteSample/Program.cs

    Java: https://github.com/Azure/azure-cosmosdb-bulkexecutor-java-getting-started/blob/master/samples/bulkexecutor-sample/src/main/java/com/microsoft/azure/cosmosdb/bulkexecutor/bulkdelete/BulkDeleter.java

So far, only the official solutions above are supported. Another workaround is TTL (time to live) in Cosmos DB. I believe you have your own logic for judging which data is correct and which is wrong and should be deleted. You could set a TTL on the wrong documents so that they are removed automatically as soon as they expire.
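To illustrate the TTL workaround, here is a minimal sketch in Python. It is not taken from the linked samples. It assumes a container object that behaves like `ContainerProxy` from the `azure-cosmos` Python SDK (`query_items` and `upsert_item`), that TTL is already enabled on the container (for example a default TTL of -1, since otherwise the per-item `ttl` property is ignored), and that the query, which you supply, selects the whole documents you consider wrong. The function name `expire_matching_documents` is hypothetical.

```python
from typing import Any


def expire_matching_documents(container: Any, query: str, ttl_seconds: int = 1) -> int:
    """Set a per-item `ttl` on every document returned by `query`.

    `container` is assumed to expose `query_items` and `upsert_item`
    like azure-cosmos' ContainerProxy. Returns the number of documents updated.
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be a positive number of seconds")

    updated = 0
    # A cross-partition query means no partition key value has to be supplied.
    # The query should return full documents (SELECT * ...), because the upsert
    # below writes back the whole body.
    for doc in container.query_items(query=query, enable_cross_partition_query=True):
        doc["ttl"] = ttl_seconds  # Cosmos DB removes the item once this many seconds pass
        container.upsert_item(body=doc)
        updated += 1
    return updated
```

With the real SDK, the container would typically come from `CosmosClient(...).get_database_client(...).get_container_client(...)`, and the call could look like `expire_matching_documents(container, "SELECT * FROM c WHERE c.batchId = 'bad-load'")`, where `batchId` is only an example field. Each upsert still costs request units, so this moves the delete work to the service rather than making it free.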

Upvotes: 1

If you write a batch job that deletes documents overnight based on some date configuration, you can achieve this. Here is an article on how to do it:

https://medium.com/@vaibhav.medavarapu/bulk-delete-documents-from-azure-cosmos-db-using-asp-net-core-8bc95dd20411
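The article uses ASP.NET Core; for readers who don't know .NET, the same idea can be sketched in Python. This is only an illustration, not the article's code. It assumes a container object that behaves like `ContainerProxy` from the `azure-cosmos` Python SDK (`query_items` and `delete_item`), uses the system property `_ts` (last-modified time in epoch seconds) as the date filter, and assumes the partition key is stored in a top-level field whose name you pass in (`pk` here is a placeholder). The function name `delete_older_than` is hypothetical.

```python
from datetime import datetime
from typing import Any


def delete_older_than(container: Any, cutoff: datetime, partition_key_field: str = "pk") -> int:
    """Delete every document last modified before `cutoff`, across all partitions.

    `container` is assumed to expose `query_items` and `delete_item`
    like azure-cosmos' ContainerProxy. Returns the number of documents deleted.
    """
    if cutoff.tzinfo is None:
        raise ValueError("cutoff must be timezone-aware")
    if not partition_key_field.isidentifier():
        raise ValueError("partition_key_field must be a simple field name")

    # Fetch only the id and partition key value; that is all a delete needs.
    query = f"SELECT c.id, c.{partition_key_field} FROM c WHERE c._ts < @cutoff"
    items = container.query_items(
        query=query,
        parameters=[{"name": "@cutoff", "value": int(cutoff.timestamp())}],
        enable_cross_partition_query=True,
    )

    deleted = 0
    for item in items:
        # The partition key value comes from the query result, so the caller
        # never has to know or enumerate the logical partitions.
        container.delete_item(item=item["id"], partition_key=item[partition_key_field])
        deleted += 1
    return deleted
```

This issues one delete per document, so for millions of documents it will be slow and will consume request units; it is meant for a scheduled job rather than an interactive call. To remove a wrongly migrated batch instead of old data, the `WHERE` clause could be changed to filter on whatever marks that batch.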

Upvotes: 0

Harsh Kishore

Reputation: 11

Has anyone tried this? It looks like a good solution in Java: https://github.com/Azure/azure-cosmosdb-bulkexecutor-java-getting-started#bulk-delete-api

Upvotes: 0
