Reputation: 15136
I have a document with some uid. I would like to create a field, indexed as not_analyzed, that is auto-generated as the 2-letter prefix (or suffix) of the uid.
Is there a way to create such a template that will auto-compute that field?
The use case is showing sampled-down statistics in Kibana (for example, with a filter of prefix = '00'), so that a unique count aggregation takes much less time.
I've used this successfully, but I currently create the prefix field on the client before writing the document to the server.
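For reference, the client-side workaround can be sketched in a few lines of Python. This is only an illustration of the idea: the function name with_uid_prefix and the target field name uid_prefix are hypothetical, and no Elasticsearch client is involved.

```python
def with_uid_prefix(doc, length=2, source="uid", target="uid_prefix"):
    """Return a copy of doc with the leading characters of doc[source] stored under target.

    The field names are assumptions made for this sketch.
    """
    enriched = dict(doc)  # shallow copy so the caller's document is left untouched
    enriched[target] = str(doc[source])[:length]
    return enriched


doc = with_uid_prefix({"uid": "00a3f9", "value": 7})
print(doc["uid_prefix"])
```

For a suffix instead of a prefix, the slice would be `[-length:]`. The enriched document is then what gets sent to the server.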
Upvotes: 0
Views: 268
Reputation: 26
I used the edgeNGram tokenizer; it seems to give the same result.
{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_ngram_analyzer": {
          "tokenizer": "edge_ngram_tokenizer"
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": "2",
          "max_gram": "2"
        }
      }
    }
  }
}
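As a rough way to see why this matches a 2-character prefix, the sketch below imitates in plain Python what an edge n-gram tokenizer with min_gram = max_gram = 2 would be expected to emit, assuming the whole value is treated as a single token. The function name edge_ngrams is made up for illustration and is not part of any Elasticsearch API.

```python
def edge_ngrams(text, min_gram=2, max_gram=2):
    """Return the leading substrings of text with lengths min_gram..max_gram.

    A plain-Python imitation of edge n-gram tokenization, not the real tokenizer.
    """
    return [text[:n] for n in range(min_gram, max_gram + 1) if n <= len(text)]


print(edge_ngrams("00a3f9"))
print(edge_ngrams("7"))
```

Under these assumptions, a value shorter than min_gram yields no token at all, which may matter if some uids can be a single character long.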
Upvotes: 1
Reputation: 30163
A prefix can be indexed using a custom analyzer built from the keyword tokenizer and the truncate token filter. Here is an example of how to index a field test_prefix containing the first 2 characters of the field test:
curl -XPUT localhost:9200/test-idx -d '{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    },
    "analysis": {
      "analyzer": {
        "prefix": {
          "tokenizer": "keyword",
          "filter": "prefix"
        }
      },
      "filter": {
        "prefix": {
          "type": "truncate",
          "length": 2
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "test": {
          "type": "string",
          "copy_to": "test_prefix"
        },
        "test_prefix": {
          "type": "string",
          "analyzer": "prefix"
        }
      }
    }
  }
}'
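To make the analyzer chain above concrete, here is a small plain-Python imitation of what it is expected to produce: the keyword tokenizer passes the whole value through as one token, and the truncate filter cuts that token to 2 characters. The function names are hypothetical and exist only in this sketch; nothing here talks to Elasticsearch.

```python
def keyword_tokenizer(text):
    """Imitate the keyword tokenizer: the entire input becomes a single token."""
    return [text]


def truncate_filter(tokens, length=2):
    """Imitate the truncate token filter: cut every token to at most `length` characters."""
    return [token[:length] for token in tokens]


def prefix_analyzer(text):
    """Chain the two steps, mirroring the "prefix" analyzer defined in the settings."""
    return truncate_filter(keyword_tokenizer(text), length=2)


print(prefix_analyzer("00a3f9"))
print(prefix_analyzer("7"))
```

In this imitation, a value shorter than 2 characters is kept as-is rather than dropped, so every document would still get a test_prefix term.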
Upvotes: 1