hansmaad

Reputation: 18915

How to delete many documents in a partitioned collection in Azure CosmosDB using MongoDB API

Consider the following document type

class Info
{
    public string Id { get; set; }
    public string UserId { get; set; }  // used as partition key
    public DateTime CreatedAt { get; set; }
}

I've created a collection using this

var bson = new BsonDocument
{
    { "shardCollection", "mydb.userInfo" },
    { "key", new BsonDocument(shardKey, "hashed") }
};
database.RunCommand(new BsonDocumentCommand<BsonDocument>(bson));

To delete all documents that are older than a certain date, I tried this

collection.DeleteManyAsync(t => t.CreatedAt >= date);

But this fails with "Command delete failed: query in command must target a single shard key". My question is: how can I efficiently delete these documents across multiple partitions? I'm not looking for advice on how to choose the partition key in this case. I think there will always be cases where I have to run modifying queries across all partitions.

I could first query for the documents with collection.Find(t => t.CreatedAt >= date) and then run DeleteManyAsync(t => idsInThatPartition.Contains(t.Id) && t.UserId == thatPartitionKey) once per partition key, but I really hope there is a better way. Example code:

var affectedPartitions = await collection.Aggregate()
    .Match(i => i.CreatedAt >= date)
    .Group(i => i.UserId, group => new { Key = group.Key })
    .ToListAsync();

foreach (var partition in affectedPartitions)
{
    await collection.DeleteManyAsync(
        i => i.CreatedAt >= date && i.UserId == partition.Key);
}

Upvotes: 0

Views: 2473

Answers (2)

Eliran Azulay

Reputation: 231

I don't know the C#-specific syntax, but I managed to work around this issue with a MongoDB bulk operation.

This solution is far from perfect, but it is the only way I could think of to solve this.

Here is an example of how I implemented it in Node.js:

// First find all the documents you want to update/delete
const res = await model.find(query).lean().exec()

//Initialize bulk operation object
var bulk = model.collection.initializeUnorderedBulkOp();

//Iterate the results
res.forEach((item: any) => {

    // Find each document by its shard key (my shard key is the document _id)
    bulk.find({ _id: item._id }).removeOne();
})

// Only execute the bulk operation if it contains at least one operation
if (bulk.length > 0)
    //Execute all operations at once
    return await bulk.execute();

Reference to MongoDB bulk operation https://docs.mongodb.com/manual/reference/method/Bulk/
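
If the shard key is not _id (in the question it is UserId), each delete filter would presumably have to carry the shard key value as well, so that every operation targets a single partition. Below is a minimal sketch of that idea, untested against Cosmos DB. buildDeleteOps and chunk are hypothetical helper names, the batch size of 500 is an arbitrary assumption, and the commented usage assumes a collection object from the MongoDB Node.js driver.

```typescript
// Hypothetical helpers for illustration; they are not part of any driver API.
type Doc = { _id: unknown; [field: string]: unknown };
type DeleteOneOp = { deleteOne: { filter: Record<string, unknown> } };

// Build one deleteOne operation per document. Every filter carries the
// shard key value so that each delete targets a single partition.
function buildDeleteOps(docs: Doc[], shardKey: string): DeleteOneOp[] {
  return docs.map((doc) => ({
    deleteOne: { filter: { _id: doc._id, [shardKey]: doc[shardKey] } },
  }));
}

// Split the operations into batches so a single request does not grow unbounded.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Assumed usage with the MongoDB Node.js driver (shown as comments only):
// const docs = await collection
//   .find(query, { projection: { _id: 1, UserId: 1 } })
//   .toArray();
// for (const batch of chunk(buildDeleteOps(docs, "UserId"), 500)) {
//   await collection.bulkWrite(batch, { ordered: false });
// }
```

Projecting only _id and the shard key keeps the initial query small, and unordered batches let the server continue past individual failures. Whether Cosmos DB accepts this form would need to be checked against your account.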

Upvotes: 1

Burgito

Reputation: 86

I ran into the same problem and finally found that this is not currently possible. The Azure Cosmos DB team is working on a solution, tentatively planned for release in the first months of 2019:

https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/34813063-cosmosdb-mongo-api-delete-many-with-partition-ke

Wait and see :(

Upvotes: 1
