Reputation: 10697
I have an HTTP log with a url field of the form "/api/api_name/api_id".
example1 url: /api/apiX/0121313123
example2 url: /api/apiY/012132/optionX/1000
What is the best practice for extracting only the "/api/api_name" part of the url and ingesting it into Elasticsearch without the id, so that I can later visualize the distribution per api_name in Kibana?
Upvotes: 0
Views: 39
Reputation: 14077
I'm not sure this is the best practice, but what works for us is indexing the API part of the URL into a separate sub-field:
DELETE /urls

PUT /urls
{
  "settings": {
    "analysis": {
      "char_filter": {
        "api_extractor_char_filter": {
          "type": "pattern_replace",
          "pattern": "/?api/([^/]+)/?.*",
          "replacement": "api/$1"
        }
      },
      "normalizer": {
        "api_extractor": {
          "filter": [
            "lowercase",
            "asciifolding"
          ],
          "char_filter": [
            "api_extractor_char_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "url": {
        "type": "text",
        "fields": {
          "api": {
            "type": "keyword",
            "normalizer": "api_extractor"
          }
        }
      }
    }
  }
}

POST /urls/_doc
{"url":"/api/apiX/0121313123"}

POST /urls/_doc
{"url":"/api/apiY/012132/optionX/1000"}

GET /urls/_search
{
  "query": {
    "term": {
      "url.api": {
        "value": "/api/apiY"
      }
    }
  }
}
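This way we keep the original URL in the url field, while the url.api sub-field indexes only the API part (stored as api/api_name, lowercased by the normalizer). Because url.api is a keyword field, it can be used for exact searches and for aggregations, so it should work fine for your use case. The term query above still matches because the normalizer is also applied to the query value.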
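To check what value ends up in url.api before building a Kibana visualization, the normalizer can be approximated outside Elasticsearch. The following is only a rough Python sketch, not part of the setup above: normalize_api is a hypothetical helper name, and it assumes Python's re module treats this particular pattern the same way as the Java regex engine behind pattern_replace.

```python
import re

# Same regex as the "api_extractor_char_filter" in the index settings.
API_PATTERN = re.compile(r"/?api/([^/]+)/?.*")


def normalize_api(url: str) -> str:
    """Approximate the value the api_extractor normalizer produces (hypothetical helper)."""
    # pattern_replace char filter: keep only "api/<api_name>".
    replaced = API_PATTERN.sub(r"api/\1", url)
    # lowercase token filter; asciifolding changes nothing for plain ASCII input.
    return replaced.lower()


if __name__ == "__main__":
    for url in ("/api/apiX/0121313123", "/api/apiY/012132/optionX/1000", "/api/apiY"):
        print(url, "->", normalize_api(url))
```

Under these assumptions both example URLs collapse to a lowercased api/api_name value, and so does the query value /api/apiY, which is why the term query finds the document. If the buckets in Kibana should keep the leading slash and the original casing, changing the replacement to "/api/$1" and dropping the lowercase filter would probably be the place to start.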
Other possible ways:
Upvotes: 1