Matt Spinks
Matt Spinks

Reputation: 6698

Is this an efficient structure/query in Azure CosmosDB?

I am adding simple search functionality for my users in my app, and I am strongly considering using Azure CosmosDB. Documents in my Cosmos database (Azure) represent phone calls, and look like this:

{
    "id": "JKEeW3aebSEAzUA",
    "partitionKey": "191625028",
    "ownerId": "191625028",
    "callTime": "2020-06-12T22:13:18.271+00:00",
    "direction": "Inbound",
    "action": "Phone Call",
    "result": "Accepted",
    "callers": [
        {
            "phoneNum": "9182914018",
            "name": "JENKS        OK",
            "location": "Jenks, OK"
        },
        {
            "phoneNum": "9189406524",
            "name": "Main IVR",
            "location": null
        },
        {
            "phoneNum": null,
            "name": "Main IVR",
            "location": null,
        }
    ]
}

I am going to provide search-ability based on the nested properties phoneNum, name, and location within each callers item. I am considering using this query:

SELECT c.id,a.phoneNum,c.callers 
FROM c join a in c.callers 
where CONTAINS(a.phoneNum, '4018')

Is this the most efficient way to perform this kind of search? I am open to restructuring my documents to make searching within those fields faster. Some things to note:

  1. It's a multi-tenant system, and we are using a "partition-per-tenant" scheme in this particular database.
  2. Some partitions/tenants will have 1,000,000+ call records and 3,000,000 - 4,000,000 nested caller records when the data import is complete.

I am new to Azure CosmosDB. Currently we provide limited search functionality similar to this via SQL Server. This structure is identical to our structure in SQL Server (parent call records, child caller records).

Upvotes: 0

Views: 335

Answers (2)

David Browne - Microsoft
David Browne - Microsoft

Reputation: 89211

It's a multi-tenant system, and we are using a "partition-per-tenant" ... Is this the most efficient way to perform this kind of search?

If you don't filter to a single partition key, you will run the query on every physical partition.

A Cosmos DB physical partition is more like a separate SQL Server than it is like a SQL Server table partition.

In a multi-tenant system most queries should be scoped to a single tenant. See

https://learn.microsoft.com/en-us/azure/cosmos-db/how-to-query-container#in-partition-query

Other than that, make sure you don't exclude that property path from your indexing policy and it should be reasonable. You can always check the Request Units consumed by a query in the portal or in the result in code. In Cosmos DB you need to keep an eye on that constantly, as it translates directly into money.

You can share provisioned Request Units among all the containers in a database, and if you use a container-per-tenant model, you can select some other useful partition key for optimized access within a tenant.

Upvotes: 0

shobhonk
shobhonk

Reputation: 656

I think an optimized solution would be to have multiple container per tenant. This will solve data segregation per tenant as well.

On top of that your partition key can be based parts of one or combinations of your data fields in such way the the number of data set across your container almost equally distributed across all your partitions.

Upvotes: 1

Related Questions