Valchris

Reputation: 1481

DocumentDB Query with Azure Search fields

I'd like to perform a DocumentDB query that looks like SELECT * FROM c where c.teams IN (@teamsList) AND CONTAINS(c.text, "some string")

The issue is that the query above is computationally intensive and nearly exceeds our S3 collection limit. This query took 2,400 RUs, and our data set is growing quickly, so we will soon hit the scanning limit for CONTAINS.

I'm aware that Azure Search is a more efficient way to search indexable fields. My question is: how do I efficiently merge the results of Azure Search with other query filters, such as the team-list restriction in my example? We are interested in exposing a "query builder" (similar example available here) where CONTAINS is a permitted operand on any field.

Upvotes: 2

Views: 486

Answers (1)

Aravind Krishna R.

Reputation: 8003

If you want to use DocumentDB for CONTAINS word searches and avoid scans (and not use Azure Search), you can do the following:

  1. Tokenize text into an array of words. You can do this with an off-the-shelf tokenizer like Lucene.NET. Let's say text is "This is a question".
  2. Store the words in an array property such as text_tokens. The content of text_tokens is ["this", "is", "question"] (canonicalized to lower case, with stop words removed).
  3. Query the values in text_tokens using ARRAY_CONTAINS(c.text_tokens, "word"). This will use the index.
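
The steps above could be sketched roughly as follows. This is only an illustration in Python, not the Lucene.NET approach itself: the regex tokenizer, the STOP_WORDS set, and the helper names tokenize, with_tokens, and build_query are all assumptions made up for this example. The query is returned as a parameterized query spec (a "query" string plus a "parameters" list) that you would pass to whichever DocumentDB client you use.

```python
import re

# Assumed, minimal stop-word list for illustration only; a real tokenizer
# such as Lucene.NET ships its own, more complete list.
STOP_WORDS = {"a", "an", "the"}


def tokenize(text):
    """Lower-case the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def with_tokens(doc):
    """Return a copy of the document with a text_tokens array added."""
    enriched = dict(doc)
    enriched["text_tokens"] = tokenize(doc["text"])
    return enriched


def build_query(teams, word):
    """Build a parameterized query spec combining the team filter
    with an ARRAY_CONTAINS lookup on the tokenized field."""
    if not teams:
        raise ValueError("at least one team is required")
    # One parameter per team, e.g. @team0, @team1, ...
    names = ["@team%d" % i for i in range(len(teams))]
    params = [{"name": n, "value": t} for n, t in zip(names, teams)]
    # Tokens are stored lower-cased, so lower-case the search word too.
    params.append({"name": "@word", "value": word.lower()})
    query = (
        "SELECT * FROM c WHERE c.teams IN (%s) "
        "AND ARRAY_CONTAINS(c.text_tokens, @word)" % ", ".join(names)
    )
    return {"query": query, "parameters": params}


doc = with_tokens({"id": "1", "teams": "red", "text": "This is a question"})
spec = build_query(["red", "blue"], "Question")
```

Here with_tokens would run before each insert or update so that text_tokens stays in sync with text, and build_query covers a single search word; a multi-word search would need one ARRAY_CONTAINS clause per token.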

Upvotes: 0
