Shri

Reputation: 485

Storm KafkaSpout fails when bolt is slow

I am using Kafka-Storm integration. Kafka loads data into a queue, and the Kafka spout pulls the data and processes it. My design is below.

Kafka -> Queue -> KafkaSpout -> Process1 Bolt -> Process2 Bolt

The problem is that if Process2 Bolt takes a long time to process the data, the KafkaSpout treats the tuple as failed and reads the data from the queue again, which results in duplicate records.

If the bolt is slow, why does KafkaSpout treat that as a failure? What is the solution? Is there a timeout or a similar property I have to set in Storm?

Upvotes: 2

Views: 1929

Answers (1)

user2720864

Reputation: 8171

Storm fails a tuple if it is not fully processed within a timeout, which defaults to 30 seconds. Because Storm guarantees processing, the Kafka spout replays a failed message until its tuple is processed successfully.


From the docs:

A tuple is considered failed when its tree of messages fails to be fully processed within a specified timeout. This timeout can be configured on a topology-specific basis using the Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS configuration and defaults to 30 seconds.
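As a rough illustration of raising that timeout, here is a minimal sketch. Storm's Config class is a HashMap of string keys to values, so a plain Map stands in for it here to keep the example runnable without the Storm jars. The class name TimeoutConfigSketch, the helper buildConf, and the values 120 and 500 are made up for illustration. The second key, topology.max.spout.pending, is not mentioned in the quoted docs; it caps how many unacked tuples a spout task keeps in flight, and lowering it is an assumption about what may help when a bolt is slow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a plain Map stands in for org.apache.storm.Config.
class TimeoutConfigSketch {
    // String values of Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS and
    // Config.TOPOLOGY_MAX_SPOUT_PENDING in Storm.
    static final String TOPOLOGY_MESSAGE_TIMEOUT_SECS = "topology.message.timeout.secs";
    static final String TOPOLOGY_MAX_SPOUT_PENDING = "topology.max.spout.pending";

    // Builds the topology settings. In a real topology you would create a
    // Storm Config and call conf.setMessageTimeoutSecs(timeoutSecs) and
    // conf.setMaxSpoutPending(maxSpoutPending) instead.
    static Map<String, Object> buildConf(int timeoutSecs, int maxSpoutPending) {
        Map<String, Object> conf = new HashMap<>();
        // Allow each tuple tree this many seconds before it is failed and replayed.
        conf.put(TOPOLOGY_MESSAGE_TIMEOUT_SECS, timeoutSecs);
        // Limit in-flight tuples per spout task so a slow bolt is not flooded.
        conf.put(TOPOLOGY_MAX_SPOUT_PENDING, maxSpoutPending);
        return conf;
    }

    public static void main(String[] args) {
        // Illustrative values only; tune them to how long Process2 Bolt really takes.
        Map<String, Object> conf = buildConf(120, 500);
        System.out.println(conf);
    }
}
```

The resulting settings would be passed along when the topology is submitted. A longer timeout only reduces replays; since a failed tuple is still replayed, the bolts should tolerate an occasional duplicate.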

Upvotes: 3
