Reputation: 1145
I have an ElasticSearch instance with Kibana, holding a lot of user-level app data that I've accumulated over a few years. One of the fields is the Java version the user is running.
I'd like to graph Java versions over time, so I can have an idea whether it's reasonable to transition to a newer version. Unfortunately I can't find a way to aggregate 1.6.0_31, 1.6.0_32, 1.6.0_37, 1.6.0_51 as a single 1.6 entry, so the graph is nearly unreadable right now.
Is there a way in Kibana to aggregate the data, like a 'scripted field' that I could write a regex for? E.g. simplified_java: osjv % '\d\.\d' which would define simplified_java as the part of the osjv field that matches a digit followed by a dot followed by a digit.
Currently it looks like Kibana only supports numeric scripted fields, which makes this hard. I'm not using LogStash, as I'm not really using 'logs', but rather a custom event reporting framework in my desktop application that (opt-in) reports usage statistics, so unfortunately I can't use any of its features.
I can manually do it, but I've already imported 2G of event data, and I'd hate to have to do it again, adding a new field just for what should be computable... :(
Is there a way to create a field based on a substring or regex in Kibana, or (failing that) a way to tell ElasticSearch to transparently do the same thing?
Upvotes: 7
Views: 3101
Reputation: 514
You can definitely do scripted fields in Kibana against string data in Elasticsearch, provided it is mapped as a keyword type. See the scripted field documentation for a tiny bit of info, and the scripted field blog post for better examples.
In short, you could do what you're looking for by building a scripted field that returns a substring:
def version = doc['osjv'].value; def dot = (version != null) ? version.lastIndexOf(".") : -1; return (dot > 0) ? version.substring(0, dot) : version;
Keep in mind that there are performance implications with scripted fields since they run each time you view them.
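Since Painless borrows Java's String methods, the substring logic can be checked in plain Java before pasting it into Kibana. The following is only a sketch: the class and method names are made up for illustration, and the "1.8" and "11.0.2" inputs are hypothetical values that do not appear in the question. It also shows a regex variant closer to what the question asked for.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for trying out the version-shortening logic locally.
public class JavaVersionSimplifier {
    // Leading "digits.digits" pair, e.g. "1.6" in "1.6.0_31".
    private static final Pattern MAJOR_MINOR = Pattern.compile("^\\d+\\.\\d+");

    // Substring approach: keep everything before the last dot.
    public static String bySubstring(String version) {
        if (version == null) return null;
        int lastDot = version.lastIndexOf('.');
        // No dot (or a leading dot): return the value unchanged.
        return lastDot > 0 ? version.substring(0, lastDot) : version;
    }

    // Regex approach: keep only the leading major.minor pair.
    public static String byRegex(String version) {
        if (version == null) return null;
        Matcher m = MAJOR_MINOR.matcher(version);
        // No match: return the value unchanged.
        return m.find() ? m.group() : version;
    }

    public static void main(String[] args) {
        // First two values come from the question; the last two are made up.
        for (String v : new String[] {"1.6.0_31", "1.6.0_51", "1.8", "11.0.2"}) {
            System.out.println(v + " -> " + bySubstring(v) + " / " + byRegex(v));
        }
    }
}
```

Both variants agree on values shaped like 1.6.0_31, but they differ on a two-part value: the substring variant turns "1.8" into "1", while the regex variant keeps "1.8". If your data contains such values, the regex form is probably the safer starting point.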
A better approach could be to create a new field in your documents with the simplified_java value. You won't need to re-ingest all your data from source, but can instead do an Update By Query. Your query is just match_all {} and then you can define a script which creates the new field. So yes, there is indexing happening, but happening "in place":
POST your-index/_update_by_query
{
  "script": {
    "source": "def version = ctx._source.osjv; def dot = (version != null) ? version.lastIndexOf('.') : -1; ctx._source.simplified_java = (dot > 0) ? version.substring(0, dot) : version",
    "lang": "painless"
  },
  "query": {
    "match_all": {}
  }
}
...haven't tested either of those scripts, but would look something like them!
Upvotes: 0