toubi
toubi

Reputation: 60

Cosmos Like Query taking different RU/s

i Have a container in cosmos that is partitioned by a synthetic key i created, this data has around 6mil documents at the moment and the synthetic key is as follows:

'InfantCereal-Fct-4WEApr2818' where the pieces are separated by a dash and are representative of Category, Type, and Date respectively

what i find odd is that when i use a like query as follows these are the results i get:


SELECT * FROM c where c.Partition LIKE 'InfantCereal-%-4WEApr2818'

Request Charge
316.78 RUs (VERY HIGH)

-----------------------------------------------------

SELECT * FROM c where c.Partition LIKE 'InfantCereal-Fct-%'

Request Charge
17.41 RUs (VERY LOW)

-----------------------------------------------------

SELECT * FROM c where c.Partition LIKE '%-Fct-4WEApr2818'

Request Charge
297.25 RUs (VERY HIGH)

-----------------------------------------------------

SELECT * FROM c where c.Period = '4WE Apr 28 18' 
Request Charge
23.56 RUs

Why is it that when querying for any date value the RU/s is very low? in this Container there would be more date difference than category and type but i don't understand why the RU/s cost is so high

also why why is it the case that when querying on Period, (which the container isn't partitioned on) costs much less RU/s than when querying?

I am quite new to Cosmos DB and i want to make sure I'm not making any mistakes in terms of my data partitioning before moving forward, any explanation on Cosmos RU/Query costs is appreciated.

Thanks!

Upvotes: 1

Views: 474

Answers (1)

Noah Stahl
Noah Stahl

Reputation: 7623

It appears you're seeing behavior that corresponds to cases where indexes can be used for more efficient queries. In the docs for the LIKE operator, it's noted that this is equivalent to using the system function RegexMatch. And in the RegexMatch docs, there's a note tucked away at the bottom:

This system function will benefit from a range index if the regular expression can be broken down into either StartsWith, EndsWith, Contains, or StringEquals system functions.

This indicates you'll get cost savings when your query can be optimized into one of these cases. This matches what you're documenting.

Accordingly, if you know you're doing one of those cases, you might consider using the explicit functions like STARTSWITH instead of LIKE to ensure the expected index treatment, if not to also be more self-documenting.

Upvotes: 1

Related Questions