Reputation: 547
I am trying to pull (and delete) all records from our database that don't have a URL with the word 'box' in it. This is the query I'm using:
{
"selector": {
"$not": {
"url": {
"$regex": ".*box.*"
}
}
},
"limit": 50
}
This query returns no records. But if I remove the $not, I get all records that do have the word 'box' in the url, but that's the opposite of what I want. Why do I get no results when adding the $not?
I have tried adding a simple base to the query like "_id":{"$gte":0} but that doesn't help.
Upvotes: 1
Views: 1520
Reputation: 1765
The "no results" issue is due to a bug in text indexes that has been recently fixed. However, neither $not
nor $regex
operators are able to take advantage of global indexes so will always result in a full database or index scan.
The way to optimise this query is to use a partial index. A partial index filters documents at indexing time rather than at query time, creating an index over a subset of the database. You then need to tell the _find
endpoint to explicitly use the partial index. For example, create an index which only includes documents not matching your regex:
POST /<db>/_index
{
"index": {
"partial_filter_selector": {
"url": {
"$not": {
"$regex": ".*box.*"
}
}
},
"fields": ["type"]
},
"ddoc" : "url-not-box",
"type" : "json"
}
then at query time:
{
"selector": {
"url": {
"$not": {
"$regex": ".*box.*"
}
}
},
"limit": 50,
"use_index": "url-not-box"
}
You can see how many documents are scanned to fulfil the query in the Cloudant UI - the execution statistics are displayed in a popup underneath the query text area.
You may also find this This article about partial indexes helpful.
Upvotes: 1
Reputation: 706
from the Cloudant doc:
You can create more complex selector expressions by combining operators. However, for Cloudant NoSQL DB Query indexes of type json, you cannot use 'combination' or 'array logical' operators such as $regex as the basis of a query.
$not
is a combination operator and therefore cannot be the basis of a query
i am able to get the following to work:
index
{
"index": {
"fields": ["url"]
},
"name" : "url-json-index",
"type" : "json"
}
query
{
"selector": {
"url": {
"$not": {
"$regex": ".*box.*"
}
}
},
"limit": 50,
"use_index": "url-json-index"
}
if you are still seeing problems, can you provide the output from _/explain
and the indexes you have in place.
Upvotes: 1