Reputation: 172
I have an Elasticsearch search implementation working for a webapp, but I am stuck on the last detail. I want to be able to filter certain fields alphabetically, so if I query 'd' it should bring back all entries whose value for that field begins with 'd'. At the moment this is what I have:
$elasticaQueryString = new Elastica_Query_QueryString();
$elasticaQueryString->setDefaultField('Name');
$elasticaQueryString->setQuery('d'.'*');
It works for fields that contain only one word, e.g. 'Dan'. But if there is more than one word, it matches against each word separately, so it returns e.g. both 'Dan Ryan' and 'Ryan Dan'. I have also tried a wildcard and a prefix query, but they give similar results.
Do I need to create a custom analyser or is there some other way around this problem?
Upvotes: 1
Views: 1718
Reputation: 9731
I would tackle this at the mapping level first. A keyword tokenizer turns the entire field value into a single token, and a lowercase filter then lowercases that token, which makes the field case-insensitive:
"analysis":{
  "analyzer":{
    "analyzer_firstletter":{
      "tokenizer":"keyword",
      "filter":"lowercase"
    }
  }
}
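To make the difference concrete, here is a minimal Python sketch of what the two kinds of analysis produce. It is only a rough model, not Elasticsearch's actual implementation: the function names are made up for illustration, and the "standard-like" analyzer is approximated by splitting on whitespace.

```python
def standard_like_tokens(text):
    """Rough stand-in for a word-splitting analyzer: one lowercased token per word."""
    return [word.lower() for word in text.split()]


def keyword_lowercase_tokens(text):
    """Rough stand-in for keyword tokenizer + lowercase filter: a single lowercased token."""
    return [text.lower()]


for title in ["Dan Ryan", "River Dog"]:
    print(title, standard_like_tokens(title), keyword_lowercase_tokens(title))
```

With word splitting, "River Dog" yields a separate token "dog", which is why a prefix search for 'd' finds it. With the keyword tokenizer the only token is "river dog", which does not start with 'd'.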
After inserting some data, this is what the index holds:
$ curl -XGET localhost:9200/test2/tweet/_search -d '{
"query": {
"match_all" :{}
}
}' | grep title
"title" : "river dog"
"title" : "data"
"title" : "drive"
"title" : "drunk"
"title" : "dzone"
Note the entry "river dog", which is what you want to avoid matching. Now, if we use a match_phrase_prefix query, you'll only match those that start with 'd':
$ curl -XGET localhost:9200/test2/tweet/_search -d '{
"query": {
"match_phrase_prefix": {
"title": {
"query": "d",
"max_expansions": 5
}
}
}
}' | grep title
"title" : "drive"
"title" : "drunk"
"title" : "dzone"
"title" : "data"
This isn't Elastica specific, but it should be fairly easy to translate over to the appropriate commands. The important part is the keyword + lowercase analyzer, and then using a match_phrase_prefix query.
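As a rough illustration of why the analyzer choice decides the outcome, the following Python sketch simulates a prefix match over the five titles from the example. It is a simplified model under stated assumptions: the helper names are hypothetical, word splitting is approximated by whitespace, and limits such as max_expansions are not modelled.

```python
def whitespace_lowercase_tokens(text):
    """Approximates a word-splitting analyzer: one lowercased token per word."""
    return [word.lower() for word in text.split()]


def keyword_lowercase_tokens(text):
    """Approximates keyword tokenizer + lowercase filter: the whole value as one token."""
    return [text.lower()]


def prefix_matches(titles, prefix, analyze):
    """Return the titles for which at least one token starts with the lowercased prefix."""
    prefix = prefix.lower()
    return [t for t in titles if any(tok.startswith(prefix) for tok in analyze(t))]


titles = ["river dog", "data", "drive", "drunk", "dzone"]

# Word-level tokens: "river dog" matches through its second word.
print(prefix_matches(titles, "d", whitespace_lowercase_tokens))

# Whole-value tokens: only titles that begin with "d" match.
print(prefix_matches(titles, "d", keyword_lowercase_tokens))
```

One caveat on the real query: max_expansions caps how many indexed terms the prefix is expanded to, so a small value such as 5 may miss matches once the index holds many distinct values starting with the same letter.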
As a sidenote, wildcards are super slow and best avoided where possible :)
Upvotes: 6