Reputation: 5891
I have a Storm application with 1 spout and 5 bolts. The topology works fine, but after about 30 minutes it reports a "Too many tuple failures" error.
Between the 1st and 2nd bolt only 20% of the data is processed because of an analytics condition; the other 80% is discarded. I suspect the error is caused by the discarded 80%, but it could be something else. I don't know the reason or how to solve it.
Upvotes: 0
Views: 789
Reputation: 62360
If you use fault tolerance in Storm (i.e., assign message IDs to tuples in your spout), you need to ack all tuples in the bolt that consumes the spout's output, even the ones you discard due to a filter condition.
Discarding a tuple still means that the tuple is fully processed, so you need to tell Storm about it. Otherwise the tuple is never acked, Storm assumes something went wrong once the message timeout expires, and it fails the tuple.
KafkaSpouts assign message IDs automatically. You just need to ack all incoming tuples:
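public void execute(Tuple input) {
    // passesFilter() is a placeholder for your own analytics condition
    if (passesFilter(input)) {
        // anchor the output tuple to the input tuple
        collector.emit(input, new Values(/* generate output tuple */));
    }
    // ack the tuple regardless of whether it was forwarded or discarded
    collector.ack(input);
}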
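To see why the discarded 80% end up as failures, here is a small stand-alone sketch of the bookkeeping involved. It is a toy model, not Storm's actual implementation: the class names `PendingTupleTracker` and `AckDemo` are made up for illustration, the 30-second value mirrors what I believe is the default of `topology.message.timeout.secs`, and it ignores the fact that in a real topology forwarded tuples must also be acked by the downstream bolts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of tuple tracking with a timeout; NOT Storm's real implementation. */
class PendingTupleTracker {
    private final long timeoutMs;
    // message ID -> time (ms) at which the tuple was emitted
    private final Map<Long, Long> pending = new HashMap<>();

    PendingTupleTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    void emitted(long msgId, long nowMs) {
        pending.put(msgId, nowMs);
    }

    void acked(long msgId) {
        pending.remove(msgId);
    }

    /** Removes and returns the IDs that were not acked within the timeout. */
    List<Long> expire(long nowMs) {
        List<Long> failed = new ArrayList<>();
        pending.entrySet().removeIf(e -> {
            if (nowMs - e.getValue() >= timeoutMs) {
                failed.add(e.getKey());
                return true;
            }
            return false;
        });
        return failed;
    }
}

class AckDemo {
    /** Counts timed-out tuples when a filtering bolt acks all tuples or only forwarded ones. */
    static int failedCount(boolean ackAll) {
        PendingTupleTracker tracker = new PendingTupleTracker(30_000);
        for (long id = 0; id < 10; id++) {
            tracker.emitted(id, 0);
            boolean forwarded = id % 5 == 0; // 20% of the tuples pass the filter
            if (forwarded || ackAll) {
                tracker.acked(id);
            }
        }
        // 30 seconds later, everything still pending is considered failed
        return tracker.expire(30_000).size();
    }

    public static void main(String[] args) {
        System.out.println(failedCount(false)); // prints 8: the discarded tuples time out
        System.out.println(failedCount(true));  // prints 0: every tuple was acked
    }
}
```

With acking limited to forwarded tuples, the 8 discarded tuples out of 10 stay pending until the timeout and are reported as failed, which matches the 80% from the question. Acking every input tuple leaves nothing pending.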
Upvotes: 3