Reputation: 1481
I'd like to perform a DocumentDB query that looks like SELECT * FROM c where c.teams IN (@teamsList) AND CONTAINS(c.text, "some string")
The problem is that this query is computationally expensive and nearly exceeds the limit of our S3 collection: it consumed 2,400 RUs, and because our data set is growing quickly, we will soon hit the scan limit for CONTAINS.
I'm aware that Azure Search is a more efficient way to search indexable fields. My question is: how do I efficiently merge the results of Azure Search with other query filters, such as restricting by team list in my example? We are interested in exposing a "query builder" (similar example available here) where CONTAINS is a permitted operator on any field.
Upvotes: 2
Views: 486
Reputation: 8003
If you want to use DocumentDB for CONTAINS word searches and avoid scans (and not use Azure Search), you can do the following:
1. Tokenize text into an array of words. You can do this with an off-the-shelf tokenizer like Lucene.NET. Let's say text is "This is a question".
2. Store the tokens in a separate property such as text_tokens. The content of text_tokens is then ["this", "is", "question"] (canonicalized to lower case, with stop words removed).
3. Query text_tokens using ARRAY_CONTAINS(c.text_tokens, "word"). This will use the index.
Upvotes: 0
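To make the tokenize / store / query approach from the answer concrete, here is a minimal sketch in Python. It is not the answerer's code: the helper names (tokenize, with_tokens, build_query), the tiny stop-word list, and the regex-based splitting are all assumptions standing in for a real tokenizer such as Lucene.NET. It also assumes c.teams holds a single value, as the IN clause in the question implies, and it emits the query as a parameterized spec (a "query" string plus a "parameters" list of name/value pairs).

```python
import re

# Illustrative stop-word list only (an assumption); a real tokenizer ships a fuller one.
STOP_WORDS = {"a", "an", "the"}


def tokenize(text):
    """Lower-case the text, split on non-alphanumerics, drop stop words and duplicates."""
    tokens = []
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        if word not in STOP_WORDS and word not in tokens:
            tokens.append(word)
    return tokens


def with_tokens(doc):
    """Return a copy of the document with a text_tokens property, ready to be stored."""
    enriched = dict(doc)
    enriched["text_tokens"] = tokenize(doc.get("text", ""))
    return enriched


def build_query(search_text, teams):
    """Build a parameterized query spec that filters by team and by every search word."""
    clauses = []
    parameters = []

    if teams:
        names = []
        for i, team in enumerate(teams):
            name = f"@team{i}"
            names.append(name)
            parameters.append({"name": name, "value": team})
        clauses.append("c.teams IN (" + ", ".join(names) + ")")

    # One ARRAY_CONTAINS per word, so each lookup is an exact match on a stored token.
    for i, word in enumerate(tokenize(search_text)):
        name = f"@word{i}"
        clauses.append(f"ARRAY_CONTAINS(c.text_tokens, {name})")
        parameters.append({"name": name, "value": word})

    query = "SELECT * FROM c"
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    return {"query": query, "parameters": parameters}


if __name__ == "__main__":
    print(with_tokens({"id": "1", "teams": "red", "text": "This is a question"}))
    print(build_query("some string", ["red", "blue"]))
```

One design consequence worth keeping in mind: ARRAY_CONTAINS matches whole tokens, so this behaves like a word search rather than the substring match that CONTAINS performs. The search input has to go through the same tokenizer as the stored text for the two to line up.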