Paul Stanley
Paul Stanley

Reputation: 2171

CosmosDb String search

I am running into trouble converting a SQL Server query into CosmosDb query. The trouble is the time to search a field. My original table in SQL Server was for retrieving addresses. As I typed in a search box a method called a stored procedure in SQL Server which searched a field on the table with a formatted address. Note the wildcard.

ALTER  PROC [dbo].[CBAddressGet]
@TEST  VARCHAR(250)
AS
set concat_null_yields_null off 
set @test = @test + '%'
select ID,address, address0,address1,address2,town,postCode,lat,lon  from 
roads_db
where address  like @test
order by ADDRESS 

The table contained 50,000 rows and the query returned exceptionally quickly under 500ms. I uploaded the table to COsmosDb. Now if I try to search for a CosmosDb for a document with an addresskey = '31 Village Road HarperVille' with a query like:

Stopwatch sw = new Stopwatch();
sw.Start();
string sql = "SELECT TOP 3 * from c WHERE c.docType = 'address' AND  STARTSWITH(c.addresskey,'31 Vill')"; 
var results =
 docClient.CreateDocumentQuery<Address>(UriFactory.CreateDocumentCollectionUri("ct", "ops"),
           sql).AsEnumerable().ToList();
sw.Stop();
System.Console.WriteLine(sw.ElapsedMilliseconds);

A search like this is taking up to 45 seconds. There must be something seriously wrong somewhere. Is my indexing policy wrong:

{
"indexingMode": "consistent",
"automatic": true,
"includedPaths": [
    {
        "path": "/*",
        "indexes": [
            {
                "kind": "Range",
                "dataType": "Number",
                "precision": -1
            },
            {
                "kind": "Hash",
                "dataType": "String",
                "precision": -1
            }
        ]
    },
    {
        "path": "/bookedAt/?",
        "indexes": [
            {
                "kind": "Range",
                "dataType": "String",
                "precision": -1
            },
            {
                "kind": "Range",
                "dataType": "Number",
                "precision": -1
            }
        ]
    },
    {
        "path": "/bookingTime/?",
        "indexes": [
            {
                "kind": "Range",
                "dataType": "String",
                "precision": -1
            },
            {
                "kind": "Range",
                "dataType": "Number",
                "precision": -1
            }
        ]
    },
    {
        "path": "/timeMark/?",
        "indexes": [
            {
                "kind": "Range",
                "dataType": "String",
                "precision": -1
            },
            {
                "kind": "Range",
                "dataType": "Number",
                "precision": -1
            }
        ]
    }
],
"excludedPaths": []

}

The data size is 5Gb on the database.

Upvotes: 0

Views: 276

Answers (1)

Krishnan Sundaram
Krishnan Sundaram

Reputation: 1401

I am from the CosmosDB engineering team. Please ensure that you have range index on addressKey defined in your indexing policy. STARTSWITH is not efficient if the field isn't range-indexed.

Upvotes: 1

Related Questions