Reputation: 87260
I have a simple record in the index:
CharacterId=847
Searching for it gives these results:
CharacterId=8 returns the result (it looks like it searches for CharacterId and 8 separately)
CharacterId= returns the result
CharacterId=* doesn't return anything
Character* returns the result
CharacterId=8* doesn't return anything
Upvotes: 0
Views: 711
Reputation: 30163
I will assume that your question is "Why does Elasticsearch do that?" To answer it, we need to look at how your record was indexed. Assuming that you were using the default analyzer, we can see that your record is indexed as two terms, characterid and 847:
$ curl "localhost:9200/twitter/_analyze?pretty=true" -d 'CharacterId=847'
{
"tokens" : [ {
"token" : "characterid",
"start_offset" : 0,
"end_offset" : 11,
"type" : "<ALPHANUM>",
"position" : 1
}, {
"token" : "847",
"start_offset" : 12,
"end_offset" : 15,
"type" : "<NUM>",
"position" : 2
} ]
}
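The split shown above can be imitated without a running cluster. This is only a rough sketch: it assumes the default analyzer behaves like "lowercase, then break on anything that is not a letter or digit", which is enough for this input but is not the real tokenizer. The function name analyze is made up for illustration.

```shell
# Rough stand-in for the default analyzer (an assumption, not the real
# Lucene tokenizer): lowercase the input, then emit each run of letters
# and digits as its own term.
analyze() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | grep -oE '[a-z0-9]+'
}

# Prints characterid and 847 on separate lines; the "=" is dropped.
analyze 'CharacterId=847'
```

The point is that "=" never makes it into the index, so nothing stored contains it.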
Now let's take a look at your queries:
$ curl "localhost:9200/twitter/_validate/query?explain=true&pretty=true" -d '{
"query_string": {"query":"CharacterId=8"}
}'
{
"valid" : true,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"explanations" : [ {
"index" : "twitter",
"valid" : true,
"explanation" : "_all:characterid _all:8"
} ]
}
You are right: this query searches for the term characterid or for the term 8. The term characterid matches the first term of your record, so you get the result back even though 8 matches nothing.
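The OR behavior described here can be sketched in a few lines of shell. The indexed terms are taken from the _analyze output; has_term is a hypothetical helper that stands in for an exact term lookup, not anything Elasticsearch provides.

```shell
# Terms the record was indexed as, per the _analyze output.
indexed_terms="characterid 847"

# has_term (hypothetical helper): succeeds only if the exact term is indexed.
has_term() {
  for term in $indexed_terms; do
    [ "$term" = "$1" ] && return 0
  done
  return 1
}

# OR semantics: one matching query term is enough for the record to match.
if has_term 'characterid' || has_term '8'; then
  echo 'record matches'
fi

# The term 8 contributes nothing, because only 847 is indexed.
has_term '8' || echo 'term 8 alone is not indexed'
```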
The second query has a similar effect, but it searches for only one term, characterid:
$ curl "localhost:9200/twitter/_validate/query?explain=true&pretty=true" -d '{
"query_string": {"query":"CharacterId="}
}'
...
"explanation" : "_all:characterid"
...
The third query is processed as a wildcard query:
$ curl "localhost:9200/twitter/_validate/query?explain=true&pretty=true" -d '{
"query_string": {"query":"CharacterId=*"}
}'
...
"explanation" : "_all:characterid=*"
...
As you can see, it searches for all terms that start with the characters characterid= (including the equals sign). Your index doesn't have any such terms, so it finds nothing.
The fourth query is again processed as a wildcard query:
$ curl "localhost:9200/twitter/_validate/query?explain=true&pretty=true" -d '{
"query_string": {"query":"Character*"}
}'
...
"explanation" : "_all:character*"
...
However, this time it searches for terms that start with character, so it matches the term characterid.
The last query is similar to the third query:
$ curl "localhost:9200/twitter/_validate/query?explain=true&pretty=true" -d '{
"query_string": {"query":"CharacterId=8*"}
}'
...
"explanation" : "_all:characterid=8*"
...
There are no terms that start with characterid=8, and because of this no records are returned.
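The three wildcard queries can be compared side by side with a small prefix check. This is a sketch under the assumption that a trailing * behaves as a plain "starts with" test against each indexed term; prefix_match is a hypothetical helper.

```shell
# Terms the record was indexed as, per the _analyze output.
terms="characterid 847"

# prefix_match (hypothetical helper): prints every indexed term that starts
# with the given prefix and returns non-zero when there is none.
prefix_match() {
  found=1
  for term in $terms; do
    case "$term" in
      "$1"*) printf '%s\n' "$term"; found=0 ;;
    esac
  done
  return "$found"
}

prefix_match 'character'     || echo 'no term starts with character'
prefix_match 'characterid='  || echo 'no term starts with characterid='
prefix_match 'characterid=8' || echo 'no term starts with characterid=8'
```

Only the first call prints a term (characterid). The other two fail because the prefix itself contains "=", which no indexed term has.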
If this is not the behavior that you need, you might want to consider not analyzing this field at all, or using an analyzer that only lowercases the value without splitting it.
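To see what a lowercase-only analyzer would change, here is a sketch that keeps the whole value as a single lowercased term. It assumes the analyzer does nothing but lowercase (in Elasticsearch this is typically built from the keyword tokenizer plus a lowercase filter); lowercase_only and starts_with are hypothetical helpers, not Elasticsearch calls.

```shell
# Stand-in for a lowercase-only analyzer (assumption): the whole value
# becomes one term, so the "=" is preserved.
lowercase_only() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# starts_with (hypothetical helper): stands in for a trailing-* wildcard query.
starts_with() {
  case "$1" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

term=$(lowercase_only 'CharacterId=847')
echo "$term"

starts_with "$term" 'characterid='  && echo 'characterid=* matches'
starts_with "$term" 'characterid=8' && echo 'characterid=8* matches'
```

With a single term, both wildcard queries that failed before now match. The trade-off, assuming the field is queried directly, is that a plain CharacterId=8 query would stop matching, since the only indexed term is characterid=847.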
Upvotes: 4