Reputation: 3911
With the sample json shown below, am trying to retrieve all documents that contains atleast one category which is array object wrapped underneath Categories that has the text value 'drinks' with the following query but the returned result is empty. Can someone help me get this right?
SELECT items.id
,items.description
,items.Categories
FROM items
WHERE ARRAY_CONTAINS(items.Categories.Category.Text, "drink")
{
"id": "1dbaf1d0-6549-11a0-88a8-001256957023",
"Categories": {
"Category": [{
"Type": "GS1",
"Id": "10000266",
"Text": "Stimulants/Energy Drinks Ready to Drink"
}, {
"Type": "GS2",
"Id": "10000266",
"Text": "Healthy Drink"
}]
}
},
Upvotes: 0
Views: 3191
Reputation: 9523
If it were me and I was building a production system with this requirement, I'd use Azure Search. Here is some info on hooking it up to DocumentDB.
If you don't want to do that and we must live with the constraint that you can't change the shape of the documents, the only way I can think to do this is to use a User Defined Function (UDF) like this:
function GetItemsWithMatchingCategories(categories, matchingString) {
if (Array.isArray(categories) && categories !== null) {
var lowerMatchingString = matchingString.toLowerCase();
for (var index = 0; index < categories.length; index++) {
var category = categories[index];
var categoryName = category.Text.toLowerCase();
if (categoryName.indexOf(lowerMatchingString) >= 0) {
return true;
}
}
}
}
Note, the code above was modified by the asker after actually trying it out so it's somewhat tested.
You would use it with a query like this:
SELECT * FROM items WHERE udf.GetItemsWithMatchingCategories(items.Categories, "drink")
Also, note that this will result in a full table scan (unless you can combine it with other criteria that can use an index) which may or may not meet your performance/RU limit constraints.
Upvotes: 2
Reputation: 1125
You need to flatten the document in your query to get the result you want by joining the array back to the main document. The query you want would look like this:
SELECT items.id, items.Categories
FROM items
JOIN Category IN items.Categories.Category
WHERE CONTAINS(LOWER(Category.Text), "drink")
However, because there is no concept of a DISTINCT query, this will produce duplicates equal to the number of Category items that contain the word "drink". So this query would produce your example document twice like this:
[
{
"id": "1dbaf1d0-6549-11a0-88a8-001256957023",
"Categories": {
"Category": [
{
"Type": "GS1",
"Id": "10000266",
"Text": "Stimulants/Energy Drinks Ready to Drink"
},
{
"Type": "GS2",
"Id": "10000266",
"Text": "Healthy Drink"
}
]
}
},
{
"id": "1dbaf1d0-6549-11a0-88a8-001256957023",
"Categories": {
"Category": [
{
"Type": "GS1",
"Id": "10000266",
"Text": "Stimulants/Energy Drinks Ready to Drink"
},
{
"Type": "GS2",
"Id": "10000266",
"Text": "Healthy Drink"
}
]
}
}
]
This could be problematic and expensive if the Categories array holds a lot of Category items that have "drink" in them.
You can cut that down if you are only interested in a single Category by changing the query to:
SELECT items.id, Category
FROM items
JOIN Category IN items.Categories.Category
WHERE CONTAINS(LOWER(Category.Text), "drink")
Which would produce a more concise result with only the id field repeated with each matching Category item showing up once:
[{
"id": "1dbaf1d0-6549-11a0-88a8-001256957023",
"Category": {
"Type": "GS1",
"Id": "10000266",
"Text": "Stimulants/Energy Drinks Ready to Drink"
}
},
{
"id": "1dbaf1d0-6549-11a0-88a8-001256957023",
"Category": {
"Type": "GS2",
"Id": "10000266",
"Text": "Healthy Drink"
}
}]
Otherwise, you will have to filter the results when you get them back from the query to remove duplicate documents.
Upvotes: 3