Reputation: 145
I am trying to write a query for a business scenario in which we have a nested field named "types" (conceptually a list of strings, like an ArrayList). Below are sample indexed documents with "types" as one of the fields.
Document 1: { "types" : [ { "Label" : "Dialog" }, { "Label" : "Violence" }, { "Label" : "Language" } ] }
Document 2: { "types" : [ { "Label" : "Dialog" } ] }
Now, the requirement is that a document should match only when the searched value is the sole value in the field. For example, if a user searches for "Dialog", the query should return only Document 2 and not Document 1, because Document 1 has other values in the field. In other words, it should return only those records whose field contains exactly the single searched value and nothing else.
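To pin the requirement down, here is a minimal plain-Python sketch of the intended matching rule. It is not Elasticsearch code; the helper name `has_only_label` is hypothetical, and the case-insensitive comparison is an assumption modeled on the lowercase normalizer in the mapping.

```python
def has_only_label(doc, label):
    """Return True when the document's "types" list holds exactly one
    entry and that entry's Label equals the searched label.

    The comparison is case-insensitive (an assumption, see lead-in).
    """
    types = doc.get("types") or []
    return len(types) == 1 and types[0].get("Label", "").lower() == label.lower()


# The two sample documents from the question.
doc1 = {"types": [{"Label": "Dialog"}, {"Label": "Violence"}, {"Label": "Language"}]}
doc2 = {"types": [{"Label": "Dialog"}]}

matching = [
    name
    for name, doc in (("Document 1", doc1), ("Document 2", doc2))
    if has_only_label(doc, "Dialog")
]
print(matching)  # ['Document 2']
```

Document 1 contains "Dialog" but is rejected because it has two more entries, which is exactly the behavior the query needs to reproduce.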
Below is the Mapping:
{
"media-hub-asset-metadata": {
"mappings": {
"dynamic": "true",
"properties": {
"Metadata": {
"properties": {
"Actors": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256,
"normalizer": "lowercase_normalizer"
},
"ngram": {
"type": "text",
"analyzer": "ngram_tokenizer_analyzer"
}
}
},
"Types": {
"type": "nested",
"properties": {
"Acronym": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Display": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Label": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256,
"normalizer": "lowercase_normalizer"
},
"ngram": {
"type": "text",
"analyzer": "ngram_tokenizer_analyzer"
}
}
},
"TVLabel": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256,
"normalizer": "lowercase_normalizer"
},
"ngram": {
"type": "text",
"analyzer": "ngram_tokenizer_analyzer"
}
}
}
}
}
}
},
"MetadataType": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256,
"normalizer": "lowercase_normalizer"
},
"ngram": {
"type": "text",
"analyzer": "ngram_tokenizer_analyzer"
}
}
},
"Network": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
Sample indexed document (as returned in a search response):
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 9139,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "media-hub-asset-metadata",
"_type" : "_doc",
"_id" : "1640655|VOD",
"_score" : 1.0,
"_source" : {
"AssetId" : 1640655,
"MaterialId" : "XMX1311",
"Metadata" : {
"Actors" : [
"Owen, Clive",
"Mueller-Stahl, Armin",
"Watts, Naomi"
],
"AirDate" : "2013-05-01T00:00:00Z",
"ClosedCaption" : true,
"Code" : "",
"Types" : [
{
"Label" : "Dialog",
"TVLabel" : "D"
},
{
"Label" : "Violence",
"TVLabel" : "V"
},
{
"Label" : "Language",
"TVLabel" : "L"
}
]
},
"MetadataType" : "VOD"
}
}
]
}
}
Any help is greatly appreciated! Thanks in advance
Upvotes: 0
Views: 635
Reputation: 16192
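You need to use script_score along with the function_score query. The script scores a document 1 when its types array has exactly one entry and 0 otherwise, and min_score then filters out the documents that scored 0.
Try the query below: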
{
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"nested": {
"path": "types",
"query": {
"bool": {
"must": [
{
"match": {
"types.Label": "Dialog"
}
}
]
}
}
}
}
]
}
},
"functions": [
{
"script_score": {
"script": {
"source": "params._source.containsKey('types') && params._source['types'] != null && params._source.types.size() == 1 ? 1 : 0"
}
}
}
],
"min_score": 0.5 // note this
}
}
}
The search result will be:
"hits": [
{
"_index": "67594441",
"_type": "_doc",
"_id": "2",
"_score": 0.53899646,
"_source": {
"types": [
{
"Label": "Dialog"
}
]
}
}
]
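The query above uses the simplified field name types, while the mapping in the question nests the field under Metadata.Types. Below is a hedged Python sketch that builds the same kind of request body for that path. The helper name `build_single_label_query` and the constants are hypothetical, and the body has not been run against a cluster. It also sets "boost_mode": "replace": function_score multiplies the query score by the function score by default, so with the default a single-entry document whose relevance score falls below 0.5 would be cut by min_score as well; with replace the final score is just the script's 1 or 0.

```python
import json

# Assumption: the nested field lives at Metadata.Types, as in the question's mapping.
NESTED_PATH = "Metadata.Types"

# Painless script: 1 when Metadata.Types has exactly one entry, else 0.
SINGLE_ENTRY_SCRIPT = (
    "def m = params._source.get('Metadata'); "
    "def t = m != null ? m.get('Types') : null; "
    "return t != null && t.size() == 1 ? 1 : 0;"
)


def build_single_label_query(label):
    """Build a function_score request body that keeps only documents whose
    nested list contains the given label and has exactly one entry."""
    return {
        "query": {
            "function_score": {
                # Step 1: the document must contain the label somewhere in the list.
                "query": {
                    "nested": {
                        "path": NESTED_PATH,
                        "query": {"match": {NESTED_PATH + ".Label": label}},
                    }
                },
                # Step 2: score 1 only when the list has a single entry.
                "functions": [
                    {"script_score": {"script": {"source": SINGLE_ENTRY_SCRIPT}}}
                ],
                # Use the script's value as the final score instead of multiplying.
                "boost_mode": "replace",
                # Step 3: drop everything the script scored 0.
                "min_score": 0.5,
            }
        }
    }


body = build_single_label_query("Dialog")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request on the index; adjust NESTED_PATH and the script if your documents use a different field layout.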
Upvotes: 1