Darshan Chaudhary

Reputation: 2233

Deleting old indexes in Amazon Elasticsearch

We are using AWS Elasticsearch for logs. The logs are streamed via Logstash continuously. What is the best way to periodically remove the old indexes?

I searched and found these recommended approaches:

  1. Use lambda to delete old indexes - https://medium.com/@egonbraun/periodically-cleaning-elasticsearch-indexes-using-aws-lambda-f8df0ebf4d9f

  2. Use scheduled docker containers - http://www.tothenew.com/blog/running-curator-in-docker-container-to-remove-old-elasticsearch-indexes/

These approaches seem like overkill for such a basic requirement as "delete indexes older than 15 days".

What is the best way to achieve that? Does AWS provide any setting that I can tweak?

Upvotes: 19

Views: 17887

Answers (3)

I followed the elasticsearch-curator documentation to install the package:

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/pip.html

Then I used the AWS base example, which shows how to automate index cleanup using the signature-based authentication provided by the requests_aws4auth package:

https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/curator.html

It worked like a charm.

You can run this inside a Lambda function, in a Docker container, or as part of your own DevOps CLI.

Upvotes: 1

RodrigoM

Reputation: 698

Elasticsearch 6.6 introduces a new feature called Index Lifecycle Management (ILM). Each index is assigned a lifecycle policy, which governs how the index transitions through specific stages until it is deleted.

For example, if you are indexing metrics data from a fleet of ATMs into Elasticsearch, you might define a policy that says:

  1. When the index reaches 50GB, roll over to a new index.
  2. Move the old index into the warm stage, mark it read only, and shrink it down to a single shard.
  3. After 7 days, move the index into the cold stage and move it to less expensive hardware.
  4. Delete the index once the required 30 day retention period is reached.

The feature is still in beta, but it is probably the way to go from now on.
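As a rough illustration of the four-step policy described above, here is a minimal sketch that builds the JSON body one would send to the ILM endpoint (`PUT _ilm/policy/<name>`). The function name `build_ilm_policy` is hypothetical. The `box_type: cold` node attribute is an assumption about how the cheaper hardware is tagged, not something stated above. Whether a given managed cluster exposes the ILM API is also not confirmed here.

```python
import json


def build_ilm_policy(rollover_size="50gb", cold_after="7d", delete_after="30d"):
    """Return an ILM policy body mirroring the four steps listed above.

    The "box_type" attribute used for the cold phase is an assumed node
    attribute; substitute whatever attribute your cluster actually uses.
    """
    return {
        "policy": {
            "phases": {
                # 1. Roll over to a new index once the current one reaches the size limit.
                "hot": {"actions": {"rollover": {"max_size": rollover_size}}},
                # 2. Mark the old index read-only and shrink it to a single shard.
                "warm": {
                    "actions": {
                        "readonly": {},
                        "shrink": {"number_of_shards": 1},
                    }
                },
                # 3. After the cold threshold, allocate to cheaper nodes (assumed attribute).
                "cold": {
                    "min_age": cold_after,
                    "actions": {"allocate": {"require": {"box_type": "cold"}}},
                },
                # 4. Delete once the retention period has passed.
                "delete": {"min_age": delete_after, "actions": {"delete": {}}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ilm_policy(), indent=2))
```

For the original question (15-day retention), calling `build_ilm_policy(delete_after="15d")` would change only the delete phase.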

Upvotes: 5

NicoKowe

Reputation: 3417

Running Curator is lightweight and easy.

Here you can find a Dockerfile, config and action-file.

https://github.com/zakkg3/curator

Also, Curator can help you if you need to (among others):

  • Add or remove indices (or both!) from an alias
  • Change shard routing allocation
  • Delete snapshots
  • Open closed indices
  • forceMerge indices
  • reindex indices, including from remote clusters
  • Change the number of replicas per shard for indices
  • rollover indices
  • Take a snapshot (backup) of indices
  • Restore snapshots

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html

Here is a typical action file for deleting indices older than 15 days:

     actions:
      1:
        action: delete_indices
        description: >-
          Delete indices older than 15 days (based on index name), for logstash-
          prefixed indices. Ignore the error if the filter does not result in an
          actionable list of indices (ignore_empty_list) and exit cleanly.
        options:
          ignore_empty_list: True
          disable_action: False  # True would skip this action entirely
        filters:
        - filtertype: pattern
          kind: prefix
          value: logstash-
        - filtertype: age
          source: name
          direction: older
          timestring: '%Y.%m.%d'
          unit: days
          unit_count: 15
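To make the two filters in the action file concrete, here is a small standard-library sketch of the selection logic: keep names with the `logstash-` prefix, parse the date from the name, and pick those older than the cutoff. It is an approximation for illustration, not Curator's own implementation, and the function name `indices_older_than` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def indices_older_than(names, days, prefix="logstash-",
                       timestring="%Y.%m.%d", now=None):
    """Return the index names whose date suffix is more than `days` old.

    Approximates a prefix filter followed by an age filter with
    source=name; names that lack the prefix or a parseable date are skipped.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    old = []
    for name in names:
        if not name.startswith(prefix):
            continue  # pattern filter: wrong prefix
        try:
            stamp = datetime.strptime(name[len(prefix):], timestring)
        except ValueError:
            continue  # no usable date in the name
        if stamp.replace(tzinfo=timezone.utc) < cutoff:
            old.append(name)
    return old


if __name__ == "__main__":
    sample = ["logstash-2019.03.01", "logstash-2019.03.10", "kibana-2019.01.01"]
    fixed_now = datetime(2019, 3, 20, tzinfo=timezone.utc)
    print(indices_older_than(sample, 15, now=fixed_now))
```

With the fixed date of 2019-03-20 and a 15-day window, only `logstash-2019.03.01` falls before the cutoff of 2019-03-05.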

Upvotes: 1
