Reputation: 581
I can see that the docs say we can set ttl on a document, but not on an index/indices. I also want to know whether setting ttl has any performance impact.
Upvotes: 14
Views: 31600
Reputation: 271
Try it this way; I use this to delete my expired indexes.
First, create a policy that describes when an index will be deleted.
(delete_log_after_2day is your policy name; min_age in the delete phase is your TTL.)
PUT http://localhost:9200/_ilm/policy/delete_log_after_2day
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "set_priority": {
            "priority": 0
          }
        }
      },
      "delete": {
        "min_age": "2d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}
Next, create an index template that selects which indexes will use this policy (index_patterns chooses the indexes, test* here).
PUT http://localhost:9200/_index_template/delete_after2day_template
{
  "index_patterns": [
    "test*"
  ],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1,
      "index.lifecycle.name": "delete_log_after_2day"
    }
  }
}
Now, when you create a new index that matches the pattern, for example test001, it will be automatically deleted after 2 days.
Note: existing indexes are not assigned to the new policy automatically, so they won't be deleted unless you assign it to them yourself.
You can use this API to assign all old indexes to your policy (the index name in the URL can be a pattern; name is your policy name):
PUT http://localhost:9200/test*/_settings
{
  "index": {
    "lifecycle": {
      "name": "delete_log_after_2day"
    }
  }
}
Then, once the expiration age has passed, the old indexes will be deleted as well.
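For anyone scripting these steps, here is a minimal Python sketch that builds the same policy and template bodies with the TTL and the index pattern as parameters. The function names (build_delete_policy, build_template) are made up for this illustration; the JSON layout is copied from the requests above. It only builds and prints the bodies; actually sending them to the cluster is left out.

```python
import json


def build_delete_policy(ttl: str) -> dict:
    """Body for PUT _ilm/policy/<name>: delete an index once it is `ttl` old."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "min_age": "0ms",
                    "actions": {"set_priority": {"priority": 0}},
                },
                "delete": {
                    "min_age": ttl,  # the TTL, e.g. "2d"
                    "actions": {"delete": {"delete_searchable_snapshot": True}},
                },
            }
        }
    }


def build_template(pattern: str, policy_name: str) -> dict:
    """Body for PUT _index_template/<name>: attach the policy to new matching indexes."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {
                "number_of_shards": 1,
                "number_of_replicas": 1,
                "index.lifecycle.name": policy_name,
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_delete_policy("2d"), indent=2))
    print(json.dumps(build_template("test*", "delete_log_after_2day"), indent=2))
```

Keeping the TTL and pattern as arguments means the same two functions can serve several retention periods without copy-pasting the JSON.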
Upvotes: 2
Reputation: 3906
The _ttl-like approach is deprecated now (because of the performance impact of scanning for expired documents over and over), and Elastic introduced index lifecycle management (ILM).
What you would want to do now instead is create indexes dynamically, for instance one per day with a date-specific name pattern such as my-app-logs-yyyy-MM-dd, and use an ILM policy to handle deletion of indexes that fall outside the wanted timeframe.
Besides that, Elastic gives you a REST API for creating and reading such policies, so you can automate that within your application to avoid manual work and keep it all nice and consistent.
Index creation itself is usually easily handled by loggers. Logback, for instance, lets you create dynamic indexes when you define the index name in the configuration in the following way:
<index>my-app-logs-%date{yyyy-MM-dd}</index>
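If your application writes to Elasticsearch directly instead of through a logger, the same date-specific naming can be produced in a couple of lines. This is only a sketch: daily_index_name is a hypothetical helper, and the my-app-logs prefix is taken from the Logback example above.

```python
from datetime import date


def daily_index_name(prefix: str, day: date) -> str:
    """Return '<prefix>-YYYY-MM-DD' for the given day."""
    return f"{prefix}-{day:%Y-%m-%d}"


if __name__ == "__main__":
    # Name of the index that today's documents would be written to.
    print(daily_index_name("my-app-logs", date.today()))
```

Zero-padded year-month-day names also sort chronologically, which makes the oldest indexes easy to spot.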
Upvotes: 7
Reputation: 52368
_ttl is enabled per index, but the expiration works per document.
If you want your indices to "expire", delete them. That is much simpler and more performant.
And yes, _ttl has a performance impact.
The Elasticsearch "way" of dealing with "expired" data is to create time-based indices: for each day or each week you create an index and index everything belonging to that day/week into it. You decide how many days you want to keep around and stick to that number.
Let's say that you want to keep the data for 7 days. On the 8th day you create the new index, as usual, then you delete the oldest one. At all times you'll have 7 indices in your cluster.
The ttl mechanism checks for expired documents every indices.ttl.interval (60 seconds by default), creates bulk requests out of them, and deletes them. This means unnecessary requests coming to the cluster.
Deleting an index, by contrast, is very easy and quick.
Take a look at this and how to easily manage time based indices with Curator.
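The rolling-window idea above (keep the last 7 daily indices, drop anything older) can be sketched in plain Python. This is an illustration only, not how Curator is implemented: expired_indices is a hypothetical helper, and it assumes daily indices are named <prefix>-YYYY-MM-DD. It just picks the names to delete; issuing the actual delete requests is up to you.

```python
from datetime import date, datetime, timedelta


def expired_indices(names, prefix, today, keep_days=7):
    """Return the daily indices that fall outside the retention window.

    Only names shaped like '<prefix>-YYYY-MM-DD' are considered;
    anything else is ignored rather than deleted by accident.
    """
    cutoff = today - timedelta(days=keep_days)  # this day and older are expired
    expired = []
    for name in names:
        if not name.startswith(prefix + "-"):
            continue
        try:
            day = datetime.strptime(name[len(prefix) + 1:], "%Y-%m-%d").date()
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if day <= cutoff:
            expired.append(name)
    return sorted(expired)


if __name__ == "__main__":
    names = [f"logs-2024-01-{d:02d}" for d in range(1, 9)]
    for name in expired_indices(names, "logs", date(2024, 1, 8)):
        print("would delete:", name)
```

With keep_days=7 and one index per day, exactly one index becomes eligible each day, so the cluster stays at 7 indices.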
Upvotes: 14