James World

Reputation: 29806

How can you tell if a CosmosDb collection is using a large partition key, and how many characters is 100 bytes?

Can you tell if a cosmos collection is using a large partition key hash?

This document describes the large partition key functionality in Cosmos Db: https://learn.microsoft.com/en-us/azure/cosmos-db/large-partition-keys?tabs=dotnetv3

It describes how to create a new collection in the portal and enable the large partition key in the advanced settings, but it doesn't describe how to tell whether an existing collection has this set. Is it possible?

How many characters is 100 bytes of a partition key?

The document also states that the default is to use only the first 100 bytes of the partition key. What is the underlying encoding of the partition key? That is, I am trying to ascertain how many characters of the partition key fit in 100 bytes, e.g. is it based on UTF8 encoding, or UTF16?

The motivation for asking is that I have inherited a database that throws multiple conflict exceptions mentioning "Resource with id already exists with a conflicting hashed partition key, Please retry with a different partition key." This happens in a collection where there is definitely no existing partition key + id clash. The partition keys have high cardinality, but their first 75-90 characters have low cardinality. I suspect a migration to a container with the large partition key setting enabled is going to be needed, or something even more drastic. :(

Upvotes: 1

Views: 655

Answers (2)

Stanislas

Reputation: 2020

@james-world's answer is correct, but there is an easier way to determine whether a container is using a Large Partition Key.

Navigate to the Cosmos resource in the Azure Portal and access the "Settings" of the container via the "Data Explorer".

If the container is using a Large Partition Key, then below the "Partition key" field the following text will appear:

Large partition key has been enabled


If the container is not using a Large Partition Key, then no such text will be displayed.

Upvotes: 2

James World

Reputation: 29806

Detecting the Partition Key Type

The partition key type can be seen in the exported ARM template - a large partition key has "version": 2:

"partitionKey": {
    "paths": [
        "/id"
    ],
    "kind": "Hash",
    "version": 2
}
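If you have several containers in one export, the same check can be scripted. The following is a minimal sketch, not an official tool: it assumes the exported template is plain JSON and that every container carries a "partitionKey" object like the one above. The helper names and the sample template are made up for illustration, and treating a missing "version" as version 1 is my assumption.

```python
import json


def find_partition_keys(node, found=None):
    """Recursively collect every "partitionKey" object in a parsed template."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "partitionKey" and isinstance(value, dict):
                found.append(value)
            else:
                find_partition_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            find_partition_keys(item, found)
    return found


def uses_large_partition_key(partition_key):
    # Version 2 marks a large partition key (see the snippet above).
    # Assumption: a missing "version" means the older version 1 hash.
    return partition_key.get("version", 1) == 2


# Hypothetical, heavily trimmed template used only to exercise the helpers.
template_text = """
{
  "resources": [
    {"properties": {"resource": {"id": "small",
      "partitionKey": {"paths": ["/pid"], "kind": "Hash"}}}},
    {"properties": {"resource": {"id": "large",
      "partitionKey": {"paths": ["/id"], "kind": "Hash", "version": 2}}}}
  ]
}
"""

keys = find_partition_keys(json.loads(template_text))
results = [(pk["paths"], uses_large_partition_key(pk)) for pk in keys]
print(results)
```

To run it against a real export, replace template_text with the contents of the downloaded template file.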

Partition Key Hash

I did some experimentation here. I created a collection with the version 1 hash and a partition key of /pid.

With this in place I created a record with a /pid longer than 100 characters and an /id of 1.

I was unable to insert an additional record with the same /id and a different /pid when the first 100 characters of the /pid were the same. When the two /pid values differed at the 100th character, the insert succeeded. I was using characters in the ASCII range, which take one byte each in UTF8 but two bytes each in UTF16, so the fact that 100 characters lined up with the 100-byte limit makes me fairly certain the encoding is UTF8.
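To make the character/byte arithmetic concrete, here is a rough sketch of that reasoning. It only models "keep the first 100 bytes of the encoded key", which is my assumption based on the experiment, and it does not reproduce the actual Cosmos hash. The function names are hypothetical.

```python
def hashed_prefix(partition_key, limit=100, encoding="utf-8"):
    """Return the bytes that would take part in the hash, assuming simple truncation."""
    return partition_key.encode(encoding)[:limit]


def chars_in_prefix(partition_key, limit=100, encoding="utf-8"):
    """Count how many whole characters fit in the first `limit` bytes."""
    used = 0
    count = 0
    for ch in partition_key:
        size = len(ch.encode(encoding))
        if used + size > limit:
            break
        used += size
        count += 1
    return count


# Same first 100 ASCII characters, different tail: indistinguishable under this model.
key_a = "a" * 100 + "X"
key_b = "a" * 100 + "Y"
# Differs at the 100th character: still distinguishable.
key_c = "a" * 99 + "Z" + "X"

print(hashed_prefix(key_a) == hashed_prefix(key_b))
print(hashed_prefix(key_a) == hashed_prefix(key_c))

# ASCII is 1 byte per character in UTF-8; "é" is 2 bytes in UTF-8;
# UTF-16 uses 2 bytes even for ASCII ("utf-16-le" avoids the byte order mark).
print(chars_in_prefix("a" * 150))
print(chars_in_prefix("é" * 150))
print(chars_in_prefix("a" * 150, encoding="utf-16-le"))
```

If the encoding were UTF16, only the first 50 ASCII characters would fit in 100 bytes, so a difference at the 100th character would not have been enough to avoid the conflict. Note that keys containing non-ASCII characters reach the 100-byte limit in fewer than 100 characters.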

Upvotes: 3
