Reputation: 1543

ELK Stack and scaling

Bear with me here. I have spent the last week or so familiarising myself with the ELK Stack.

I have a working single-box solution running the ELK stack, and I have the basics down on how to forward more than one type of log and how to put them into different ES indexes.

This is all working pretty well, and I would now like to expand operations.

My question is about how to scale the solution out to cover more data.

The current solution handles a small subset of data and works fine, but I would like to aggregate a lot more. For example, I am currently pushing message tracking logs from 4 mailbox servers. I want to do the same for 40 mailbox servers, and much, much busier ones.

I would also like to push over IIS log files from the Client Access servers. There are 18 CAS servers, and around 30 minutes of IIS logs from one server during peak time came to 120MB, with almost 1 million records.
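To put a rough number on that, here is a back-of-the-envelope sketch in Python. It assumes all 18 CAS servers peak at the same time, rounds "almost 1 million" up to 1,000,000 records, and treats 1MB as 1,000,000 bytes; the function and variable names are only illustrative.

```python
# Rough peak ingest estimate for the IIS logs described above.
# Assumptions: all servers peak together, 1 MB = 1,000,000 bytes,
# and "almost 1 million records" is rounded up to 1,000,000.
CAS_SERVERS = 18
WINDOW_SECONDS = 30 * 60          # the 30-minute sample
MB_PER_SERVER = 120               # log size per server in that window
RECORDS_PER_SERVER = 1_000_000    # records per server in that window


def peak_estimate(servers, window_s, mb_per_server, records_per_server):
    """Return aggregate events/s, MB/s and average bytes per record."""
    total_records = servers * records_per_server
    total_mb = servers * mb_per_server
    return {
        "events_per_second": total_records / window_s,
        "mb_per_second": total_mb / window_s,
        "bytes_per_record": mb_per_server * 1_000_000 / records_per_server,
    }


estimate = peak_estimate(
    CAS_SERVERS, WINDOW_SECONDS, MB_PER_SERVER, RECORDS_PER_SERVER
)
print(estimate)
```

Under those assumptions the IIS logs alone come to roughly 10,000 events per second at peak, before the mailbox servers are counted.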

This volume of data would most likely collapse a single box running ELK.

I haven't really looked into it, but I have read that ES allows some form of clustering to add more instances. Does the same apply to Logstash? Should Kibana be run on more than one server, or on a different server from both Logstash and ES?

Upvotes: 1

Answers (2)

SimonH

Reputation: 984

As Alain mentioned, adding more ES nodes will improve performance (and give you redundancy).

On the logstash front, we have two logstash servers feeding into ES. At the moment we just direct different servers to log to the different logstash servers, but we're likely to add an HAProxy layer in front to do this automatically and, again, provide redundancy.

With Kibana, I wouldn't worry too much. As far as I'm aware, most of the processing is done in the client browser, and whatever isn't depends more on the performance of the ES cluster.
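As an illustration of the static split (not our actual setup), a minimal Python sketch could assign each source server to a logstash host with a stable hash. The host names `logstash-01`/`logstash-02` and the `mbxNN` source names are hypothetical, and this gives no failover, which is what a proxy layer would add.

```python
import zlib

# Hypothetical logstash host names, for illustration only.
LOGSTASH_HOSTS = ["logstash-01", "logstash-02"]


def pick_logstash(source_host, hosts=LOGSTASH_HOSTS):
    """Map a source server to one logstash host, the same one every time."""
    # crc32 is stable across runs, unlike Python's built-in hash() on strings.
    index = zlib.crc32(source_host.encode("utf-8")) % len(hosts)
    return hosts[index]


# Hypothetical names for the 40 mailbox servers from the question.
sources = [f"mbx{n:02d}" for n in range(1, 41)]
assignment = {source: pick_logstash(source) for source in sources}

for source, target in sorted(assignment.items()):
    print(f"{source} -> {target}")
```

The split is only approximately even, and adding a third logstash host reshuffles most assignments, so this suits a small, fairly fixed set of shippers.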

Upvotes: 0

Alain Collins

Reputation: 16362

You will hit limits with logstash if you're doing a lot of processing on the records: groks, conditionals, etc. Watch the CPU utilization of the machine for hints.

For elasticsearch itself, it's about RAM and disk IO. Having more nodes in a cluster should provide both.

With two elasticsearch nodes, you'll get redundancy (a copy on both machines). Add a third, and you can start to realize an IO benefit (writing two copies to three machines spreads the IO).
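The arithmetic behind that can be sketched in a few lines of Python. This assumes one replica (two copies of every shard) and shards spread evenly across the nodes; `write_share_per_node` is just an illustrative helper, not anything from elasticsearch.

```python
def write_share_per_node(nodes, copies=2):
    """Fraction of the total indexing volume each node has to write.

    Assumes shards are spread evenly. A node never holds two copies of
    the same shard, so the copy count is capped at the node count.
    """
    if nodes < 1:
        raise ValueError("need at least one node")
    effective_copies = min(copies, nodes)
    return effective_copies / nodes


for n in (1, 2, 3, 4, 6):
    print(f"{n} node(s): each writes {write_share_per_node(n):.0%} of the data")
```

With two nodes every document is written on both machines, so each still does 100% of the write work. The per-node share only starts to fall from the third node onwards.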

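The ideal data node will have 64GB of RAM on the machine, with 31GB allocated to the elasticsearch heap. Staying below 32GB keeps the JVM's compressed object pointers enabled, and the rest of the RAM is left for the filesystem cache.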
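As a rule-of-thumb sketch in Python, assuming the common guidance of giving the heap about half of the machine's RAM and capping it at 31GB (the helper name and the cap parameter are illustrative):

```python
def suggested_heap_gb(machine_ram_gb, heap_cap_gb=31):
    """Rule of thumb: about half the RAM for the heap, capped at 31GB."""
    if machine_ram_gb < 1:
        raise ValueError("machine_ram_gb must be at least 1")
    return min(machine_ram_gb // 2, heap_cap_gb)


for ram in (16, 32, 64, 128):
    print(f"{ram}GB machine -> {suggested_heap_gb(ram)}GB heap")
```

Beyond 64GB of RAM the heap stays at 31GB, so extra memory only helps through the filesystem cache.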

You'll probably want to add non-data nodes, which handle the routing of data to be indexed and the 'reduce' phase when running queries. Put two of them behind a load balancer.

Upvotes: 4
