Zeeshan

Reputation: 12421

Is logstash input stage multithreaded?

I was looking at the documentation for the Logstash pipeline.workers option, which states:

-w, --pipeline.workers COUNT

Sets the number of pipeline workers to run. This option sets the number of workers that will, in parallel, execute the filter and output stages of the pipeline. If you find that events are backing up, or that the CPU is not saturated, consider increasing this number to better utilize machine processing power. The default is the number of the host’s CPU cores.

I was wondering whether the Logstash input stage also uses all the cores of my machine:

input {
  kafka {
    bootstrap_servers=>"kfk1:9092,kfk2:9092"
    topics => ["mytopic"]
    group_id => "mygroup"
    key_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
    value_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
    codec => avro {
      schema_uri => "/apps/schema/rocana3.schema"
    }
  }
}

Does this input > kafka > codec > avro chain also utilize all the cores of my machine, or is it a single-threaded stage?

Upvotes: 1

Views: 2390

Answers (1)

sysadmin1138

Reputation: 1303

Logstash input pipelining has a few quirks in it. It can be multithreaded, but it takes some configuration. There are two ways to do it:

  • The input plugin has its own workers (threading) parameter; not many do.
  • Each input {} block will run on its own thread.

So, if you're running the file {} input plugin, which lacks a worker config option, each file you define will be serviced by one and only one thread.
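As a rough sketch of the second approach, splitting files across separate input {} blocks gives each block its own input thread. The two paths below are hypothetical and only there for illustration:

```
# Sketch only: both paths are made-up examples.
input {
  # This block is serviced by its own input thread.
  file {
    path => "/var/log/app1/*.log"
  }
}

input {
  # A second block, so a second input thread.
  file {
    path => "/var/log/app2/*.log"
  }
}
```

Whether this helps depends on where the bottleneck is; if the filters are the slow part, extra input threads will mostly just wait on the pipeline.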

Codecs run in the context of the plugin that calls them, so they are typically single-threaded per invocation.
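For the kafka {} input in the question, the plugin's own threading setting is consumer_threads (default 1) rather than a generic workers option. A minimal sketch, reusing the question's settings and assuming the topic has at least four partitions; the value 4 is an illustration, not a recommendation. As far as I understand it, the avro codec then decodes on whichever consumer thread received the message:

```
input {
  kafka {
    bootstrap_servers => "kfk1:9092,kfk2:9092"
    topics => ["mytopic"]
    group_id => "mygroup"
    # Assumed value: ideally no more than the topic's partition count,
    # since any threads beyond that would sit idle.
    consumer_threads => 4
    key_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
    value_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
    codec => avro {
      schema_uri => "/apps/schema/rocana3.schema"
    }
  }
}
```

With the default of a single consumer thread, this input and its codec would run on one thread regardless of how many cores the machine has.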

Where most Logstash deployments I've run into end up using many cores is in the filter {} stage of the pipeline, not the input. That is why Logstash provides a way to configure the number of pipeline workers. For an input, or set of inputs, pulling thousands of events a second, you can load up a box pretty far solely on the filter {} to output {} pipeline.

Upvotes: 1
