Reputation: 3435
I have this index:
"analysis" : {
"filter" : {
"meeteor_ngram" : {
"type" : "nGram",
"min_gram" : "2",
"max_gram" : "15"
}
},
"analyzer" : {
"meeteor" : {
"filter" : [
"meeteor_ngram"
],
"tokenizer" : "standard"
}
}
},
And this document:
{
"_index" : "test_global_search",
"_type" : "meeting",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "LightBulb Innovation",
"purpose" : "The others should listen the Innovators and also improve the current process.",
"location" : "Projector should be set up.",
"meeting_notes" : [
{
"meeting_note_text" : "The immovator proposed to change the Bulb to Led."
}
],
"agenda_items" : [
{
"text" : "Discuss The Lightning"
}
]
}
}
And despite the fact that I am doing neither lowercase filtering nor tokenization, both of these queries return the document:
curl -XGET 'localhost:9200/global_search/meeting/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"name": "lightbulb"
}
}
}
'
curl -XGET 'localhost:9200/global_search/meeting/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"name": "Lightbulb"
}
}
}
'
And here is the mapping:
curl -XGET 'localhost:9200/global_search/_mapping/meeting?pretty'
{
"global_search" : {
"mappings" : {
"meeting" : {
"properties" : {
"agenda_items" : {
"properties" : {
"text" : {
"type" : "text",
"analyzer" : "meeteor"
}
}
},
"location" : {
"type" : "text",
"analyzer" : "meeteor"
},
"meeting_notes" : {
"properties" : {
"meeting_note_text" : {
"type" : "text",
"analyzer" : "meeteor"
}
}
},
"name" : {
"type" : "text",
"analyzer" : "meeteor"
},
"purpose" : {
"type" : "text",
"analyzer" : "meeteor"
}
}
}
}
}
}
Upvotes: 0
Views: 94
Reputation: 7649
Both lightbulb and Lightbulb return your document because of the custom analyzer you created.
Check how your analyzer is tokenizing your data.
GET global_search/_analyze?analyzer=meeteor
{
"text" : "LightBulb Innovation"
}
You will see the following output:
{
"tokens": [
{
"token": "Li",
"start_offset": 0,
"end_offset": 9,
"type": "word",
"position": 0
},
{
"token": "Lig",
"start_offset": 0,
"end_offset": 9,
"type": "word",
"position": 0
},
{
"token": "Ligh",
"start_offset": 0,
"end_offset": 9,
"type": "word",
"position": 0
},
{
"token": "Light",
"start_offset": 0,
"end_offset": 9,
"type": "word",
"position": 0
},
.... other terms starting from Light
{
"token": "ig", ======> tokens below this get matched when you run your query
"start_offset": 0,
"end_offset": 9,
"type": "word",
"position": 0
},
{
"token": "igh",
"start_offset": 0,
"end_offset": 9,
"type": "word",
"position": 0
},
{
"token": "ight",
"start_offset": 0,
"end_offset": 9,
"type": "word",
"position": 0
},
.... other tokens.
Now, when you run a match query, the same custom analyzer is applied to the text you searched for and tokenizes it in the same manner. Tokens like 'ig', 'igh' and many more are produced on both sides and get matched. That's why match does not seem to work the way you expect.
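To make the overlap concrete, here is a rough Python sketch of what the analyzer does on both the index side and the query side. It is not Elasticsearch code: the helper names `ngrams` and `analyze` are made up for illustration, and splitting on whitespace is only an assumed stand-in for the standard tokenizer, which is good enough for this simple input.

```python
def ngrams(word, min_gram=2, max_gram=15):
    """Return every substring of `word` with length between min_gram and max_gram."""
    return {
        word[start:start + size]
        for start in range(len(word))
        for size in range(min_gram, max_gram + 1)
        if start + size <= len(word)
    }


def analyze(text):
    """Approximate the 'meeteor' analyzer: split into words, then n-gram each word.

    Assumption: whitespace splitting stands in for the standard tokenizer.
    No lowercasing happens anywhere, mirroring the analyzer in the question.
    """
    tokens = set()
    for word in text.split():
        tokens |= ngrams(word)
    return tokens


# Tokens stored for the document's name field.
indexed = analyze("LightBulb Innovation")

# The query text goes through the same analysis, so any shared n-gram can match.
for query in ("lightbulb", "Lightbulb"):
    shared = sorted(indexed & analyze(query))
    print(query, "->", shared)
```

In this sketch the lowercase query still shares case-insensitive-looking fragments such as 'ig', 'ight' and 'ulb' with the indexed name, and since a match query only needs some analyzed token to match by default, that is enough for the document to come back.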
In the case of a term query, no search analyzer is applied; it searches for the term as it is. If you search for LightBulb, it will be found among the tokens, but lightBulb would not be found.
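The term query behaviour can be sketched the same way. This is only an illustration in Python, not Elasticsearch itself: `index_terms` and `term_matches` are hypothetical helpers, and whitespace splitting is assumed in place of the standard tokenizer.

```python
def index_terms(text, min_gram=2, max_gram=15):
    """Terms stored for `text`: case-preserving n-grams of each whitespace-separated word."""
    return {
        word[i:i + n]
        for word in text.split()
        for n in range(min_gram, max_gram + 1)
        for i in range(len(word) - n + 1)
    }


def term_matches(stored_terms, term):
    # A term query is not analyzed: the input must equal a stored term exactly,
    # including its case.
    return term in stored_terms


stored = index_terms("LightBulb Innovation")
for term in ("LightBulb", "lightBulb", "Light"):
    print(term, term_matches(stored, term))
```

Here LightBulb and Light are found because they are stored n-grams with exactly that casing, while lightBulb is not, since no stored term starts with a lowercase l followed by ightBulb.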
Hope this clarifies your question.
Upvotes: 3
Reputation: 896
Please map your name field as a keyword (the replacement for the older "index" : "not_analyzed" setting), so that it is not analyzed:
"name" : {
"type" : "keyword",
"index" : true
}
Upvotes: 0