Reputation: 405
I have an Amazon CloudSearch domain. I want to filter on the field 'language', but not all objects have a language. Objects that do have a language should be filtered by it, while objects that have no language should still be returned.
I want to filter with
( or language:'en' language:null )
However, null cannot be passed within a string.
Is this possible? If so, how would it be done?
Upvotes: 10
Views: 3913
Reputation: 13675
You can search for existence by using the prefix or range operators, depending on your field type. If the type is a term or a string then you can use prefix like so:
(prefix field=example '')
This will yield only results that are not null for the field example.
For dates you can use an inclusive date range:
(range field=updated ['0000-01-01T00:00:00.000Z',})
This will only include items with an updated date on or after the given time; items with a null updated date will not be included. You can do similar searches for other field types.
Similarly, you can use the not operator to get the set of items with null fields.
For example, all items with a null example field:
(not (prefix field=example ''))
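To tie this back to the question, here is a minimal Python sketch that builds the combined filter as a structured query string. The helper names (quote, has_value, is_missing, equals_or_missing) are made up for illustration, and it assumes language is a text or literal field so that the prefix trick applies.

```python
def quote(value: str) -> str:
    """Quote a string for the structured syntax, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def has_value(field: str) -> str:
    """Match documents where the field has any value (text/literal fields)."""
    return f"(prefix field={field} '')"


def is_missing(field: str) -> str:
    """Match documents where the field has no value."""
    return f"(not {has_value(field)})"


def equals_or_missing(field: str, value: str) -> str:
    """Match documents whose field equals value OR whose field is absent."""
    return f"(or {field}:{quote(value)} {is_missing(field)})"


print(equals_or_missing("language", "en"))
# (or language:'en' (not (prefix field=language '')))
```

The resulting string would be sent with the structured query parser, for example through the boto3 cloudsearchdomain client's search(query=..., queryParser='structured') call.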
Upvotes: 1
Reputation: 86
If you are willing to use the Lucene query parser you can express your query like this:
(*:* OR -language:*) OR language:en
Note: The funky (*:* OR ...) construct is necessary because of the way Lucene treats negated OR clauses.
In general, you can filter by existence / non-existence of a field with the Lucene query parser:
All documents containing field: field:[* TO *]
All documents not containing field: -field:[* TO *]
Note: If field is textual (text or literal datatypes), you don't need range queries and you can shorten the above to field:* and -field:*
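If you build these queries in code, something like the following Python sketch could work. The function names are hypothetical, and it assumes the value is a simple token such as en that needs no Lucene escaping.

```python
def lucene_has_field(field: str, textual: bool = True) -> str:
    """Existence check: field:* for text/literal fields, a range query otherwise."""
    return f"{field}:*" if textual else f"{field}:[* TO *]"


def lucene_missing_field(field: str, textual: bool = True) -> str:
    """Non-existence check: the negated existence clause."""
    return "-" + lucene_has_field(field, textual)


def lucene_equals_or_missing(field: str, value: str, textual: bool = True) -> str:
    """Documents whose field equals value, plus documents without the field.

    Assumes value is a plain token that needs no escaping.
    """
    missing = lucene_missing_field(field, textual)
    return f"(*:* OR {missing}) OR {field}:{value}"


print(lucene_equals_or_missing("language", "en"))
# (*:* OR -language:*) OR language:en
```

The string would then be submitted with the Lucene parser selected (q.parser=lucene on the search request).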
Upvotes: 6
Reputation: 405
I looked elsewhere as well, and it seems:
The simplest way to do that is to set a default value for the field, and then use that value for your null.
For example, set the default to the string "null", then you can easily test for that.
I believe you can add a default value, and re-index, and that should reapply the default.
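A rough Python sketch of that idea, assuming language is a literal field and that the sentinel string "null" never occurs as a real language value. The helper names are made up; client is assumed to be a boto3 cloudsearch configuration client, whose define_index_field and index_documents calls update the field and request a re-index.

```python
NULL_SENTINEL = "null"  # assumed placeholder; must not collide with a real language code


def language_field_definition(default: str = NULL_SENTINEL) -> dict:
    """Index field options for a literal 'language' field with a default value."""
    return {
        "IndexFieldName": "language",
        "IndexFieldType": "literal",
        "LiteralOptions": {"DefaultValue": default},
    }


def apply_default(client, domain_name: str) -> None:
    """Update the field definition, then ask the domain to re-index.

    `client` is assumed to be boto3.client("cloudsearch").
    """
    client.define_index_field(
        DomainName=domain_name,
        IndexField=language_field_definition(),
    )
    client.index_documents(DomainName=domain_name)


def language_filter(language: str, default: str = NULL_SENTINEL) -> str:
    """Structured query: the requested language or the sentinel default."""
    return f"(or language:'{language}' language:'{default}')"


print(language_filter("en"))
# (or language:'en' language:'null')
```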
Upvotes: 4
Reputation: 2681
There is no way to cleanly do exactly what you want, but here are two options:
1. Add a new field, has_language, setting its value to language!=null at doc submission time.
2. Use a range query on the field:
(range field=language [0,})
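For the has_language option, a minimal Python sketch of deriving the flag when building the document batch might look like this. It assumes has_language is indexed as an int field (1 or 0); the helper name to_add_operation and the sample documents are made up for illustration.

```python
import json


def to_add_operation(doc_id: str, fields: dict) -> dict:
    """Build one 'add' operation, deriving has_language from the language field."""
    enriched = dict(fields)  # copy so the caller's dict is not modified
    enriched["has_language"] = 1 if fields.get("language") is not None else 0
    return {"type": "add", "id": doc_id, "fields": enriched}


batch = [
    to_add_operation("doc1", {"title": "Hello", "language": "en"}),
    to_add_operation("doc2", {"title": "Untitled"}),
]

# JSON document batch ready to upload to the domain's document endpoint
print(json.dumps(batch, indent=2))
```

With that field in place, the original filter could be written as (or language:'en' has_language:0).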
Upvotes: 2