dr11

Reputation: 5756

Cleanup Cosmos collection by document type

I have a collection (1B records) that I need to clean up.

Schema:

// <pk> - item Id
// <type> - literal enum, e.g. Type1|Type2|Type3

{
  "partKey": "<pk>",
  "type": "<type>"
}

I need to delete all documents where type = Type2.

  1. I can't execute DELETE ... WHERE c.type = 'Type2' because it is not supported
  2. I can't use a stored procedure because the collection is partitioned
  3. I'd prefer not to use the SDK

What is the best way to clean up the collection by the specified condition?

Upvotes: 0

Views: 131

Answers (1)

Steve Johnson

Reputation: 8690

I created the following test data in my collection:

[
    {
        "partKey": "1",
        "type": "1"
    },
    {
        "partKey": "5",
        "type": "4"
    },
    {
        "partKey": "2",
        "type": "2"
    },
    {
        "partKey": "3",
        "type": "2"
    },
    {
        "partKey": "4",
        "type": "2"
    }
]

Then create a data flow in Azure Data Factory (ADF). Both the source and the sink dataset point to your Cosmos DB collection.

  1. Check the Include system columns option in the Source settings.

  2. Add an Alter Row transformation with a "Delete if" condition that matches the documents to remove (for the test data above, type == '2').

  3. In the Sink settings, check the Allow delete option and enter your partition key.
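To make the logic of these steps concrete, here is a minimal Python sketch that imitates what the data flow does to the test data. It is only an illustration, not ADF code: the function names mark_for_delete and apply_sink are hypothetical, the "id" values are made up (Cosmos DB stores an id with every document), and the assumption is that the sink addresses each document to delete by its partition key and id.

```python
from typing import Callable, Dict, List, Tuple

Doc = Dict[str, str]

# Test documents from above. The "id" values are invented for illustration.
docs: List[Doc] = [
    {"id": "a", "partKey": "1", "type": "1"},
    {"id": "b", "partKey": "5", "type": "4"},
    {"id": "c", "partKey": "2", "type": "2"},
    {"id": "d", "partKey": "3", "type": "2"},
    {"id": "e", "partKey": "4", "type": "2"},
]


def mark_for_delete(
    rows: List[Doc], delete_if: Callable[[Doc], bool]
) -> List[Tuple[Doc, bool]]:
    """Imitates the Alter Row step: tag each row with the delete condition result."""
    return [(row, delete_if(row)) for row in rows]


def apply_sink(
    collection: List[Doc],
    marked_rows: List[Tuple[Doc, bool]],
    partition_key: str,
    allow_delete: bool,
) -> List[Doc]:
    """Imitates the sink step: drop marked rows, matched by (partition key, id)."""
    if not allow_delete:
        # Without "Allow delete", rows marked for deletion are not removed.
        return list(collection)
    to_delete = {
        (row[partition_key], row["id"]) for row, marked in marked_rows if marked
    }
    return [d for d in collection if (d[partition_key], d["id"]) not in to_delete]


marked = mark_for_delete(docs, lambda row: row["type"] == "2")
remaining = apply_sink(docs, marked, partition_key="partKey", allow_delete=True)
print([d["partKey"] for d in remaining])  # ['1', '5']
```

In this sketch only the documents with partKey "1" and "5" survive, since the other three have type "2".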

Result: [screenshot]

Upvotes: 1
