willz

Reputation: 2050

Elasticsearch sorting on string not returning expected results

When sorting on a string field with multiple words, Elasticsearch splits the string value into terms and uses the min or max term as the sort value. For example, when sorting on a field with the value "Eye of the Tiger" in ascending order, the sort value is "Eye", and when sorting in descending order the value is "Tiger".

Let's say I have "Eye of the Tiger" and "Wheel of Death" as entries in my index. When I do an ascending sort on this field, I would expect "Eye of the Tiger" to come first, since "E" comes before "W". Instead, "Wheel of Death" comes up first, because "Death" is the min term of that value, "Eye" is the min term of "Eye of the Tiger", and "D" sorts before "E".
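To make the behavior concrete, here is a minimal sketch in plain Python (it does not call Elasticsearch). It assumes the analyzer can be approximated by lowercasing and splitting on whitespace, and the helper names `tokens` and `sort_key` are made up for illustration.

```python
def tokens(value):
    # Rough stand-in for an analyzer: lowercase, then split on whitespace.
    return value.lower().split()


def sort_key(value, order="asc"):
    # An analyzed field holds several terms per document, so the sort
    # picks one of them: the smallest for asc, the largest for desc.
    terms = tokens(value)
    return min(terms) if order == "asc" else max(terms)


titles = ["Eye of the Tiger", "Wheel of Death"]

# "death" < "eye", so "Wheel of Death" sorts first in ascending order.
ascending = sorted(titles, key=lambda t: sort_key(t, "asc"))
print(ascending)
```

Under these assumptions the ascending order puts "Wheel of Death" first, which matches what I am seeing.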

Does anyone know how to turn off this behavior and just allow a regular sort on this string field?

Upvotes: 13

Views: 9531

Answers (2)

Felix

Reputation: 6104

If you want the sorting to be case-insensitive, "index": "not_analyzed" doesn't work (the value is indexed verbatim, so uppercase and lowercase letters sort separately), so I've created a custom sort analyzer.

index-settings.yml

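index:
    analysis:
        analyzer:
            sort:
                type: custom
                # keyword tokenizer keeps the whole value as a single token
                tokenizer: keyword
                # lowercase filter makes the sort case-insensitive
                filter: [lowercase]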

Mapping:

...
"articleName": {
    "type": "string",
    "analyzer": "standard",
    "fields": {
        "sort": {
            "type": "string",
            "analyzer": "sort"
        }
    }
}
...
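As a rough illustration of what this analyzer does to the sort order, here is a small Python sketch. It only emulates the keyword tokenizer plus lowercase filter on ASCII strings and does not talk to a cluster. The sample titles and the helper name `sort_analyzer` are made up, and the request body assumes the sub-field from the mapping above is addressed as "articleName.sort".

```python
import json


def sort_analyzer(value):
    # Keyword tokenizer: the whole string stays one token.
    # Lowercase filter: case is folded before comparison.
    return value.lower()


titles = ["Wheel of Death", "eye of the tiger", "Zebra"]

# Case-sensitive order, roughly what a not_analyzed field gives:
# uppercase letters sort before lowercase ones.
case_sensitive = sorted(titles)

# Case-insensitive order, approximating the custom "sort" analyzer.
case_insensitive = sorted(titles, key=sort_analyzer)

# Search request body that sorts on the sub-field instead of the
# analyzed main field.
search_body = {
    "query": {"match_all": {}},
    "sort": [{"articleName.sort": {"order": "asc"}}],
}
print(json.dumps(search_body, indent=2))
```

Searches still go against "articleName" itself. Only the sort clause points at the sub-field.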

Upvotes: 4

Michael at qbox.io

Reputation: 321

As mconlin mentioned, if you want to sort on the unanalyzed field value, you need to specify "index": "not_analyzed" in the mapping to get the sort you described. But if you also need to keep this field tokenized for searching, this post by sloan shows a great example. Using a multi-field to keep two different mappings for one field is very common in Elasticsearch.
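A minimal sketch of that multi-field idea, written as Python dicts that could be serialized to JSON. The field name "title" and the sub-field name "raw" are placeholders I picked for illustration, not names from the question, and the "string" / "not_analyzed" syntax assumes the same older mapping style discussed here.

```python
import json

# Mapping sketch: the main field stays analyzed for full-text search,
# while the "raw" sub-field stores the exact value for sorting.
mapping = {
    "properties": {
        "title": {
            "type": "string",
            "fields": {
                "raw": {"type": "string", "index": "not_analyzed"},
            },
        }
    }
}

# Search on the analyzed field, sort on the untouched sub-field.
search_body = {
    "query": {"match": {"title": "tiger"}},
    "sort": [{"title.raw": {"order": "asc"}}],
}

print(json.dumps(mapping, indent=2))
print(json.dumps(search_body, indent=2))
```

With a mapping like this, "Eye of the Tiger" is compared as one whole value when sorting, so it lands before "Wheel of Death" in ascending order.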

Hope this helps, let me know if I can offer more explanation.

Upvotes: 10
