Kousha
Kousha

Reputation: 36189

Elasticsearch aggregate by field prefix

I have data entries of the form

{
   "id": "ABCxxx",
   // Other fields
}

Where ABC is a unique identifier that defines the "type" of this record. (For example a user would be USR1234..., an image would be IMG1234...)

I want to get a list of all the different types of records that I have in my ES. So in essence I want to do a sort by id but only looking at the first three characters of the id.

This doesn't work obviously, because it sorts by id (so USR123 is different than USR456):

{
    "fields": ["id"],
    "aggs": {
        "group_by_id": {
            "terms": {
                "field": "id"
            }
        }
    }
}

How do I write this query?

Upvotes: 0

Views: 1972

Answers (2)

Rahul
Rahul

Reputation: 16335

As suggested by paqash already that the same can be achieved via script but I would suggest an alternate of storing "type" as a different field altogether in your schema.

For eg.

USR1234 : {id:"USR1234", type:"USR"}
IMG1234 : {id:"USR1234", type:"IMG"}

This would avoid unnecessary complications in scripting and keep your query interface clean.

Upvotes: 1

paqash
paqash

Reputation: 2314

You can use the painless scripting language to get this accomplished.

{
    "fields": ["id"],
    "aggs": {
        "group_by_id": {
            "terms": {
                "script" : {
                    "inline": "doc['id'].substring(0,3)",
                    "lang": "painless"
                }
            }
        }
    }
}

More info here. Please note that the syntax for the substring method may not be exactly right.

Upvotes: 1

Related Questions