Reputation: 4907
I have the price in the data and this price is currency formatted price.
"price": "R$ 400,00"
My requirement is to find the data having price >= 200 for e.g.
So I added the numeric_analyzer
to the price field.
{
"settings": {
"analysis": {
"filter": {
"number_only": {
"type": "pattern_replace",
"pattern": "\\D"
}
},
"analyzer": {
"numeric_analyzer": {
"type": "custom",
"tokenizer": "keyword",
"filter": [
"number_only"
]
}
}
}
},
"mappings": {
"properties": {
"price": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
},
"numeric": {
"type": "text",
"analyzer": "numeric_analyzer"
}
}
}
}
}
}
This numeric_analyzer
will remove all the non-digits. So if the price is "R$ 400,00" then price.numeric
field will have the value 40000
. Basically the price in cents. But this price will be as String type.
if I use range
query on string type then the results will not be accurate.
For e.g: Suppose price numeric field has the following prices:
- 12000 (R$ 120,00)
- 14000 (R$ 140,00)
- 130400 (R$ 1.304,00)
With the following range query:
{
"range": {
"price.numeric": {
"gte": "12000",
"lte": "14000"
}
}
}
I will get all the above prices because the prices are in String type.
Is there a way to convert this string type to integer? or any other way to achieve this?
Thanks
Upvotes: 0
Views: 1519
Reputation: 186
you can use thie pipeline to insert your data to es:
POST _ingest/pipeline/_simulate
{
"pipeline": {
"processors": [
{
"grok": {
"field": "price",
"patterns": [
"%{WORD:w1}[$] %{NUMBER:w2:double}[,]%{NUMBER:w3:int}"
]
}
},
{
"script": {
"source": "ctx.res = ctx.w2*1000+ctx.w3"
}
}
]
},
"docs": [
{
"_source": {
"price": "R$ 140,00"
}
},
{
"_source": {
"price": "R$ 120,00"
}
},
{
"_source": {
"price": "R$ 1.304,00"
}
}
]
}
Upvotes: 1