Reputation: 60
i Have a container in cosmos that is partitioned by a synthetic key i created, this data has around 6mil documents at the moment and the synthetic key is as follows:
'InfantCereal-Fct-4WEApr2818' where the pieces are separated by a dash and are representative of Category, Type, and Date respectively
what i find odd is that when i use a like query as follows these are the results i get:
SELECT * FROM c where c.Partition LIKE 'InfantCereal-%-4WEApr2818'
Request Charge
316.78 RUs (VERY HIGH)
-----------------------------------------------------
SELECT * FROM c where c.Partition LIKE 'InfantCereal-Fct-%'
Request Charge
17.41 RUs (VERY LOW)
-----------------------------------------------------
SELECT * FROM c where c.Partition LIKE '%-Fct-4WEApr2818'
Request Charge
297.25 RUs (VERY HIGH)
-----------------------------------------------------
SELECT * FROM c where c.Period = '4WE Apr 28 18'
Request Charge
23.56 RUs
Why is it that when querying for any date value the RU/s is very low? in this Container there would be more date difference than category and type but i don't understand why the RU/s cost is so high
also why why is it the case that when querying on Period, (which the container isn't partitioned on) costs much less RU/s than when querying?
I am quite new to Cosmos DB and i want to make sure I'm not making any mistakes in terms of my data partitioning before moving forward, any explanation on Cosmos RU/Query costs is appreciated.
Thanks!
Upvotes: 1
Views: 474
Reputation: 7623
It appears you're seeing behavior that corresponds to cases where indexes can be used for more efficient queries. In the docs for the LIKE operator, it's noted that this is equivalent to using the system function RegexMatch
. And in the RegexMatch docs, there's a note tucked away at the bottom:
This system function will benefit from a range index if the regular expression can be broken down into either StartsWith, EndsWith, Contains, or StringEquals system functions.
This indicates you'll get cost savings when your query can be optimized into one of these cases. This matches what you're documenting.
Accordingly, if you know you're doing one of those cases, you might consider using the explicit functions like STARTSWITH
instead of LIKE
to ensure the expected index treatment, if not to also be more self-documenting.
Upvotes: 1