DWilches

Reputation: 23035

Mapping a field's values before sorting

I want to sort my data in ElasticSearch using a field customer_priority that has the values: IMMEDIATE, HIGH, MEDIUM, LOW.

As the sort happens alphabetically, I'm getting the following undesired ordering: HIGH, IMMEDIATE, LOW, MEDIUM. But I would like: LOW, MEDIUM, HIGH, IMMEDIATE.

How can I instruct ElasticSearch to sort in an arbitrary way?

Some things I have investigated:

BTW: I'm using ES 2.3

Upvotes: 0

Views: 486

Answers (1)

Phil

Reputation: 1266

Script-based sorting is the main option, but it has several downsides. See this documentation for how to sort with a script (I'm assuming you're using the newest ES version, 5.3 at this time).

You would add something like this for your case (the script is kept on one line because a JSON string cannot contain raw line breaks, and any unexpected value falls back to -1):

"sort" : {
    "_script" : {
        "type" : "number",
        "script" : {
            "lang": "painless",
            "inline": "def val = doc['customer_priority'].value; if (val == 'LOW') { return 0; } if (val == 'MEDIUM') { return 1; } if (val == 'HIGH') { return 2; } if (val == 'IMMEDIATE') { return 3; } return -1;"
        },
        "order" : "asc"
    }
}

Note: I didn't test this code sample.

The downsides: if customer_priority is mapped as an analyzed text field, you will need to enable fielddata for it in your mapping, which increases your memory requirement; fielddata is not enabled by default. (A keyword field is read from doc values and does not need it.) Luckily the cardinality of this field is small (just 4 values), so the overhead is small. The other downside is that script sorting is slow, as the script needs to run on each document.

Another option might be to denormalize customer_priority further by adding a numeric field, customer_priority_sort, with a value of 1, 2, 3, or 4 that maps to LOW, MEDIUM, HIGH, and IMMEDIATE respectively, so you can sort on this value instead of the strings.

You'd have to keep the two fields in sync if they change, which is extra overhead, but you might be happier with the results. It would also be a step towards storing just an integer for these enumerated values instead of the strings, which is more disk-efficient anyway.
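To make the denormalization idea concrete, here is a minimal Python sketch of what the indexing side could look like. The names PRIORITY_RANK, with_sort_rank and SORT_BY_PRIORITY are hypothetical and not part of any Elasticsearch client; the ranks follow the 1 to 4 scheme, using the IMMEDIATE value from the question. It only prepares plain dicts, so you would still pass them to whatever client you use.

```python
# Hypothetical lookup table: rank 1-4 in the desired sort order.
PRIORITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "IMMEDIATE": 4}


def with_sort_rank(doc):
    """Return a copy of doc with customer_priority_sort derived from customer_priority."""
    enriched = dict(doc)
    # Raises KeyError on an unknown priority, so bad data is caught at index time.
    enriched["customer_priority_sort"] = PRIORITY_RANK[doc["customer_priority"]]
    return enriched


# Sort clause for the search body: a plain numeric sort, no script required.
SORT_BY_PRIORITY = [{"customer_priority_sort": {"order": "asc"}}]

# Local demonstration of the resulting order (no cluster involved).
docs = [with_sort_rank({"customer_priority": p})
        for p in ["HIGH", "IMMEDIATE", "LOW", "MEDIUM"]]
ordered = sorted(docs, key=lambda d: d["customer_priority_sort"])
print([d["customer_priority"] for d in ordered])
```

Calling with_sort_rank on every document just before indexing (and again on every update) is what keeps the two fields in sync.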

EDIT: for ES 2.3, Groovy is the default scripting language (Painless only arrived in 5.0), so you will have to adapt the Painless code sample above, but the approach is the same. Script-based sorting is supported just the same in 2.3 and 5.3; see the docs.
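For what it's worth, a Groovy version for 2.3 might look roughly like the sketch below, built as a Python dict so the script ends up as a valid one-line JSON string. I haven't run it against a 2.3 cluster. It assumes customer_priority is a not_analyzed string, so the field holds the exact uppercase values, and as far as I know inline scripts are disabled by default in 2.x, so you would need script.inline: true in elasticsearch.yml. GROOVY_RANK_SCRIPT and search_body are just illustrative names.

```python
import json

# Untested Groovy equivalent of the Painless script: a ternary chain on one line.
# The last expression is the script's return value; unknown values map to -1.
GROOVY_RANK_SCRIPT = (
    "def v = doc['customer_priority'].value; "
    "v == 'LOW' ? 0 : v == 'MEDIUM' ? 1 : v == 'HIGH' ? 2 : v == 'IMMEDIATE' ? 3 : -1"
)

# Search body with a script-based sort, in the shape the 2.x sort docs describe.
search_body = {
    "query": {"match_all": {}},
    "sort": {
        "_script": {
            "type": "number",
            "script": {"lang": "groovy", "inline": GROOVY_RANK_SCRIPT},
            "order": "asc",
        }
    },
}

# Serialize to the JSON you would send to the _search endpoint.
print(json.dumps(search_body, indent=2))
```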

Upvotes: 1
