prem

Reputation: 624

Elasticsearch and Logstash Performance Tuning

On a single-node Elasticsearch setup running alongside Logstash, we tested parsing a 20 MB and a 200 MB log file into Elasticsearch on different AWS instance types, i.e. medium, large, and xlarge.

Environment details: Medium instance, 3.75 GB RAM, 1 core, 4 GB SSD storage, 64-bit, moderate network performance. Instance running Logstash and Elasticsearch.

Scenario 1

**With default settings**
Result:
20 MB log file: 23 mins, 175 events/sec
200 MB log file: 3 hrs 3 mins, 175 events/sec


Added the following settings:

Java heap size: 2 GB

    bootstrap.mlockall: true
    indices.fielddata.cache.size: "30%"
    indices.cache.filter.size: "30%"
    index.translog.flush_threshold_ops: 50000
    indices.memory.index_buffer_size: 50%

    # Search thread pool
    threadpool.search.type: fixed
    threadpool.search.size: 20
    threadpool.search.queue_size: 100
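
For reference, a minimal sketch of where these settings live, assuming Elasticsearch 1.x (the era these setting names come from): the heap size is set through the ES_HEAP_SIZE environment variable before starting the node, while the remaining lines belong in config/elasticsearch.yml.

    # assuming Elasticsearch 1.x: the heap is set via an environment
    # variable before launch, not in elasticsearch.yml
    export ES_HEAP_SIZE=2g
    ./bin/elasticsearch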

**With added settings**
Result:
20 MB log file: 22 mins, 180 events/sec
200 MB log file: 3 hrs 07 mins, 180 events/sec

Scenario 2

Environment details: r3.large instance, 15.25 GB RAM, 2 cores, 32 GB SSD storage, 64-bit, moderate network performance. Instance running Logstash and Elasticsearch.

**With default settings**
Result:
20 MB log file: 7 mins, 750 events/sec
200 MB log file: 65 mins, 800 events/sec

Added the following settings:

Java heap size: 7 GB (other parameters same as above)

**With added settings**
Result:
20 MB log file: 7 mins, 800 events/sec
200 MB log file: 55 mins, 800 events/sec

Scenario 3

Environment details: r3.xlarge (R3 High-Memory Extra Large) instance, 30.5 GB RAM, 4 cores, 32 GB SSD storage, 64-bit, moderate network performance. Instance running Logstash and Elasticsearch.

**With default settings**
Result:
20 MB log file: 7 mins, 1200 events/sec
200 MB log file: 34 mins, 1200 events/sec

Added the following settings:

Java heap size: 15 GB (other parameters same as above)

**With added settings**
Result:
20 MB log file: 7 mins, 1200 events/sec
200 MB log file: 34 mins, 1200 events/sec

I wanted to know:

1. What is the benchmark for this performance?
2. Does the performance meet the benchmark, or is it below it?
3. Why am I not able to see any difference even after increasing the Elasticsearch JVM heap?
4. How do I monitor Logstash and improve its performance?

I would appreciate any help on this, as I am new to Logstash and Elasticsearch.

Upvotes: 2

Views: 7616

Answers (2)

Sagar Ghuge

Reputation: 241

You have set the Java heap size correctly with respect to your total memory, but I think you are not utilizing it properly. I hope you have an idea of what the fielddata size is: the default is 60% of the heap size, and you are reducing it to 30%.

I don't know why you are doing this; my perception might be wrong for your use case, but it is a good habit to allocate indices.fielddata.cache.size: "70%" or even 75%. With that setting, however, you must also set something like indices.breaker.total.limit: "80%" to avoid an Out Of Memory (OOM) exception. You can check Limiting Memory Usage for further details.
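
For illustration, a minimal sketch of how those two settings could sit together in elasticsearch.yml, assuming the 70%/80% split suggested above (the percentages are a starting point, not values tuned for this workload):

    # give fielddata most of the heap, but cap total memory use
    # with the parent circuit breaker to avoid OOM
    indices.fielddata.cache.size: "70%"
    indices.breaker.total.limit: "80%"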

Upvotes: 0

gnom1gnom

Reputation: 752

I think this situation is related to the fact that Logstash uses fixed-size queues (see The Logstash event processing pipeline):

Logstash sets the size of each queue to 20. This means a maximum of 20 events can be pending for the next stage. The small queue sizes mean that Logstash simply blocks and stalls safely when there’s a heavy load or temporary pipeline problems. The alternatives would be to either have an unlimited queue or drop messages when there’s a problem. An unlimited queue can grow unbounded and eventually exceed memory, causing a crash that loses all of the queued messages.

I think what you should try is to increase the worker count with the '-w' flag.
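
For example, a minimal invocation assuming Logstash 1.x and a pipeline config named logstash.conf (the filename is illustrative):

    # start Logstash with 4 filter workers instead of the default 1
    logstash -f logstash.conf -w 4

More workers parallelize the filter stage, so the fixed-size queues described above drain faster instead of blocking the input.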

On the other hand, many people say that Logstash should be scaled horizontally rather than by adding more cores and GB of RAM (How to improve Logstash performance).

Upvotes: 1
