Dibish

Reputation: 9303

Elasticsearch alphabetical sorting based on first character

I have a collection of first names.

team dhoni
dhoni1
dibeesh 200
bb vineesh
devan

I want to sort them in ascending alphabetical order (A-Z), like this:

bb vineesh
devan
dhoni1
dibeesh 200
team dhoni

Mapping

 "first_name": {
      "type": "string",
      "store": "true"
},

I have tried

{
  "sort": [
    {
      "first_name": {
        "order": "asc"
      }
    }
  ],
  "query": {
    "match_all": {}
  }
}

When I run this query, I get the names in the following order:

dibeesh 200
bb vineesh
devan
team dhoni
dhoni1

Elasticsearch seems to give first preference to first names that contain a number.

How can I prevent this?

Upvotes: 7

Views: 21005

Answers (5)

Shajesh

Reputation: 71

The difference in ASCII values between uppercase and lowercase letters makes names sort differently depending on their case. So one solution (a trick) is to store a lowercased copy of the data you want to sort in a separate field, and sort on that field.

This is not a perfect solution, but it helps when sorting data for drop-down menus.
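As a rough illustration of the trick above, here is a minimal Python sketch that does not talk to Elasticsearch at all. The mixed-case sample names and the `first_name_sort` field name are assumptions made up for this example; the point is only that a plain code-point sort puts uppercase before lowercase, while sorting on a lowercased copy does not.

```python
# Hypothetical sample data: mixed case is added here only to show the effect.
names = ["team dhoni", "Dhoni1", "dibeesh 200", "BB vineesh", "devan"]

# A plain sort compares code points, so every uppercase letter
# sorts before any lowercase letter.
by_code_point = sorted(names)

# Store a lowercased copy in a second field (the field name is an assumption)
# and sort on that field instead of the original value.
docs = [{"first_name": n, "first_name_sort": n.lower()} for n in names]
by_lowercase = [
    d["first_name"] for d in sorted(docs, key=lambda d: d["first_name_sort"])
]

print(by_code_point)
print(by_lowercase)
```

The original value stays available for display, and only the extra field is used for ordering.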

Upvotes: 1

user1642018

Reputation:

I am using Elasticsearch 6.3 (the latest at this time), and as per the documentation, to sort on a text field you need to add a sub-field of type keyword:

"title":{ 
    "type":     "text",
    "fields": {
        "raw": { 
            "type":  "keyword"
        }
    }
}
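The answer shows only the mapping, so here is a hedged sketch of how a matching sort request might be built. It only constructs the request bodies as Python dictionaries and prints them as JSON; no client library is used, and the `match_all` query is borrowed from the question rather than from this answer. The sub-field is addressed as `title.raw`, following the usual multi-field naming.

```python
import json

# Mapping fragment from the answer: "title" is analyzed text,
# "title.raw" is an un-analyzed keyword sub-field.
mapping = {
    "title": {
        "type": "text",
        "fields": {
            "raw": {"type": "keyword"}
        }
    }
}

# Sort on the keyword sub-field, not on the analyzed text field.
search_body = {
    "query": {"match_all": {}},
    "sort": [{"title.raw": {"order": "asc"}}],
}

print(json.dumps(search_body, indent=2))
```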

Upvotes: 4

Artem

Reputation: 59

The keyword analyzer helped me:

first_name: {
     type: "text",
     analyzer: "keyword"
}

Docs

Upvotes: 1

Tyler

Reputation: 11499

I had a similar issue and the other answer didn't quite work for me. I referred to this documentation instead, and was able to solve it with a mapping like this:

"name": { 
    "type":     "string",
    "analyzer": "english",
    "fields": {
        "raw": { 
            "type":  "string",
            "index": "not_analyzed"
        }
    }
}

and then querying and sorting like this

{
    "query": {
        "match": {
            "name": "dhoni"
        }
    },
    "sort": {
        "name.raw": {
            "order": "asc"
        }
    }
}

Upvotes: 9

progrrammer

Reputation: 4489

I think the problem is that your string is analyzed when it is written to Elasticsearch. It uses the Standard Analyzer: an analyzer of type standard is built using the Standard Tokenizer with the Standard Token Filter, Lower Case Token Filter, and Stop Token Filter.

What does this mean? Suppose you are using a field "name" with the default mapping (standard analyzer).

When you index:

team dhoni --> team, dhoni

dhoni1 --> dhoni1

dibeesh 200 --> dibeesh, 200

and so on.

So when sorting, it is clear why "dibeesh 200" comes first: it is sorted by the token 200, not by dibeesh.

So you can either leave the string not analyzed (in which case upper case and lower case behave differently), or use the simple analyzer (so that you sort by letters only and case does not matter), or use a multi-field to keep both an analyzed and a not_analyzed version.
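To make the token-based explanation concrete, here is a small Python simulation. It is only a sketch: `analyze` is a crude stand-in for the standard analyzer (lowercase, then split on anything that is not a letter or digit), and `asc_sort_key` assumes that an ascending sort on a multi-token field uses the smallest token. Neither function is part of Elasticsearch.

```python
import re

names = ["team dhoni", "dhoni1", "dibeesh 200", "bb vineesh", "devan"]

def analyze(text):
    """Crude stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())

def asc_sort_key(text):
    # Assumption: an ascending sort on an analyzed field compares
    # the smallest token of each document.
    return min(analyze(text))

for name in names:
    print(f"{name!r} -> tokens {analyze(name)}, sort key {asc_sort_key(name)!r}")

# Order produced when sorting on the analyzed tokens.
analyzed_order = sorted(names, key=asc_sort_key)

# Order produced when the whole, un-analyzed string is the sort key.
whole_string_order = sorted(names)

print(analyzed_order)
print(whole_string_order)
```

Under these assumptions the first list reproduces the order reported in the question, and the second list is the order the question asks for.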

Here is a way to do that,

POST x2/x3/_mapping
{
    "x3":{
        "properties": {
            "name" :{
                "type" :"string",
                "fields" :{
                    "raw" :{
                        "type": "string",
                        "index_analyzer": "simple"
                    }
                }
            }
        }
    }
}

And here is the query,

POST x2/x3/_search
{
    "sort": [
       {
          "name.raw": {
             "order": "asc"
          }
       }
    ]
} 

This works as expected. Hope this helps!!

Upvotes: 8
