Reputation: 2577
I have a bunch of categories with translations in my category field. I have defined language analyzers for the fields in my index so I can search them, but the search doesn't match the singular form of my words. For example, wasmachine is the singular of wasmachines (stored in titles.title-nl), yet searching for wasmachine finds nothing. What am I missing?
Demo document
"_source" : {
  "google_id" : 2706,
  "titles" : [
    {
      "title-en" : "laundry appliances",
      "title-de" : "waschen & trocknen",
      "title-fr" : "appareils de blanchisserie",
      "title-nl" : "wasmachines"
    }
  ]
}
How I mapped them
PUT categories/_mapping/category
{
  "dynamic": false,
  "properties": {
    "titles.title-nl": {
      "type": "text",
      "analyzer": "dutch"
    },
    "titles.title-en": {
      "type": "text",
      "analyzer": "english"
    },
    "titles.title-de": {
      "type": "text",
      "analyzer": "german"
    },
    "titles.title-fr": {
      "type": "text",
      "analyzer": "french"
    }
  }
}
The way I search for them
GET categories/_search
{
  "size": 4,
  "query": {
    "multi_match": {
      "query": "wasmachines",
      "fields": ["titles.title-de", "titles.title-en", "titles.title-fr", "titles.title-nl"]
    }
  }
}
Upvotes: 2
Views: 397
Reputation: 7473
The problem is that the default dutch analyzer doesn't know how to stem the word wasmachines, so the singular and the plural end up as different tokens. You will need to recreate your index with a custom analyzer that uses a stemmer_override filter.
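To see the mismatch for yourself, the _analyze API can show the tokens the built-in analyzer produces. This is only a diagnostic sketch: the sample text is made up for illustration, and the exact tokens returned depend on your Elasticsearch version.

```
GET _analyze
{
  "analyzer": "dutch",
  "text": "wasmachine wasmachines"
}
```

If the two words come back as two different tokens, a query for one form cannot match a document that was indexed with the other.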
Following the Elasticsearch documentation, you can rebuild the dutch analyzer and add a stemmer_override filter in front of the stemmer. Each rule has the form input => stem, so putting wasmachine=>wasmachines in the rules maps the singular onto the plural form and both are indexed as the same token:
PUT categories/
{
  "settings": {
    "analysis": {
      "filter": {
        "dutch_stop": {
          "type": "stop",
          "stopwords": "_dutch_"
        },
        "dutch_keywords": {
          "type": "keyword_marker",
          "keywords": ["voorbeeld"]
        },
        "dutch_stemmer": {
          "type": "stemmer",
          "language": "dutch"
        },
        "dutch_override": {
          "type": "stemmer_override",
          "rules": [
            "fiets=>fiets",
            "bromfiets=>bromfiets",
            "wasmachine=>wasmachines",
            "ei=>eier",
            "kind=>kinder"
          ]
        }
      },
      "analyzer": {
        "rebuilt_dutch": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "dutch_stop",
            "dutch_keywords",
            "dutch_override",
            "dutch_stemmer"
          ]
        }
      }
    }
  }
}
You will also need to use that new analyzer in your mapping:
PUT categories/_mapping/category
{
  "dynamic": false,
  "properties": {
    "titles.title-nl": {
      "type": "text",
      "analyzer": "rebuilt_dutch"
    },
    "titles.title-en": {
      "type": "text",
      "analyzer": "english"
    },
    "titles.title-de": {
      "type": "text",
      "analyzer": "german"
    },
    "titles.title-fr": {
      "type": "text",
      "analyzer": "french"
    }
  }
}
After that, once your documents are indexed again, you will be able to search for wasmachine and get the documents that have wasmachines.
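As a quick check before rerunning the search, you can run the custom analyzer through the _analyze API on the recreated index. This is a sketch that assumes the index is named categories and the analyzer rebuilt_dutch, as in the settings above; the sample text is only an illustration.

```
GET categories/_analyze
{
  "analyzer": "rebuilt_dutch",
  "text": "wasmachine wasmachines"
}
```

If the override is picked up, both words should come back as the same token. If they still differ, check that the rule is present in dutch_override and that dutch_override is listed before dutch_stemmer in the filter chain.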
Upvotes: 3