Reputation: 1307
I am using Logstash to sync data from Kafka to Elasticsearch. The input topic has 8 partitions, and I have set consumer_threads => 8 to consume the Kafka topic in parallel.
input {
  kafka {
    bootstrap_servers => "bootstrapServer"
    topics => "topicName"
    codec => json
    group_id => "groupName"
    id => ""
    consumer_threads => 8
  }
}
After the input section, the pipeline has a filter section and an output section.
How can I increase Logstash worker parallelism without affecting the ordering of data within a Kafka partition?
Does Logstash use an in-memory queue between the input and the filter and output stages? How can I ensure that data from a single partition is processed by a single filter and output thread in Logstash?
Upvotes: 1
Views: 799
Reputation: 4072
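You cannot have multiple worker threads process data in parallel and also preserve the order of the data. Logstash does not guarantee event order by default either: to preserve it, you need to set pipeline.workers to 1 and pipeline.ordered to true. Note that pipeline.ordered accepts auto, true, or false rather than 1; the default, auto, enables ordering only when pipeline.workers is 1.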
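As a minimal sketch, the two settings could look like this in logstash.yml. Putting them in logstash.yml is an assumption about your setup; if you run several pipelines, the same keys would go under the relevant entry in pipelines.yml instead.

```yaml
# logstash.yml (assumed location; the same keys can be set per pipeline in pipelines.yml)
pipeline.workers: 1     # run a single filter/output worker thread
pipeline.ordered: true  # enforce event ordering; Logstash will not start if workers > 1
```

With a single worker, the filter and output stages become the throughput bottleneck, so consumer_threads => 8 only parallelizes reading from Kafka, not the processing that follows.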
Upvotes: 1