Handling skewed processing time of events in a streaming application

I have a streaming application (written on Spark/Storm/whatever, it does not matter). Kafka is used as the source of stream events. Some events take significantly more resources (time, CPU, etc.) to process than others.

There are framework-specific nuances in how these larger messages are dealt with. For example:

  1. Spark Streaming's batch gets blocked until the event is processed.
  2. Storm keeps processing events until the maximum number of unacked messages has been reached for a partition.

Since in Kafka acking is only possible by committing an offset up to some message ID per partition, not at the individual message level, whenever one of these larger events arrives the application comes to a halt at some point. This is the price of the duplicate-processing tradeoff: if the application dies while processing a large message, how much work can you afford to redo, given that all messages after the large one would need to be replayed? Another issue is lag alerting: because the large message is stuck, even if I process the messages behind it, the committed offset does not move.
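To make the per-partition commit constraint concrete, here is a minimal sketch using the plain Kafka consumer API (the topic name, group ID, and `process` method are placeholders, not part of the actual setup):

```java
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class PerPartitionCommitDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // a "large" event blocks everything behind it in this partition
                    // Committing offset+1 acknowledges ALL records of this partition up to and
                    // including this one; there is no way to ack a later record while skipping it.
                    consumer.commitSync(Collections.singletonMap(
                            new TopicPartition(record.topic(), record.partition()),
                            new OffsetAndMetadata(record.offset() + 1)));
                }
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // placeholder for the actual (possibly expensive) work
    }
}
```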

Based on this understanding, I am reaching the conclusion that Kafka is better suited when all messages in a topic have similar processing times (at least Spark and Storm only offer tuning at the topic level, not at the individual partition level).

Hence, these are the options I have:

  1. Make my partitioning strategy ensure that all messages in one topic need roughly equal processing time (partition-level segregation would not work).
  2. Use a streaming source where acking at the individual message ID level is possible, for example a Redis queue or RabbitMQ.
  3. Make duplicate message handling very cheap (say, by doing a lookup to check whether a message has already been processed) and set the max unacked message limit very high (a sketch of such an idempotency check follows this list).
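As a rough sketch of option 3, the idempotency check could look like the following. The in-memory map stands in for whatever shared store (Redis, a database table, ...) the application would actually use, and the class and method names are purely illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Make reprocessing cheap: check whether a message ID was already handled
// before doing the expensive work, so replays after a failure cost little.
public class IdempotentProcessor {
    private final Map<String, Boolean> processed = new ConcurrentHashMap<>();

    /** Returns true if the message was processed now, false if it was a duplicate. */
    public boolean processOnce(String messageId, Runnable expensiveWork) {
        // putIfAbsent is atomic: only the first caller for a given ID wins.
        if (processed.putIfAbsent(messageId, Boolean.TRUE) != null) {
            return false; // already processed, skip the expensive work on replay
        }
        expensiveWork.run();
        return true;
    }
}
```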

Are there other options to deal with these scenarios?

Upvotes: 0

Views: 89

Answers (1)

Do you need to maintain per-key ordering during processing? If so, you could use a specialized consumer like Confluent's Parallel Consumer: https://github.com/confluentinc/parallel-consumer. It processes different keys in parallel while ensuring that records with the same key are processed sequentially (it also works with unordered keys). This lets small and large records from the same partition be processed in parallel, removing the head-of-line blocking issue. As you suggest, an idempotence mechanism would still be useful in case of failure.
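A minimal sketch of KEY-ordered processing with the Parallel Consumer, loosely following the project's README (the topic name and `process` method are placeholders, and the exact callback argument type varies between library versions):

```java
import io.confluent.parallelconsumer.ParallelConsumerOptions;
import io.confluent.parallelconsumer.ParallelConsumerOptions.ProcessingOrder;
import io.confluent.parallelconsumer.ParallelStreamProcessor;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.util.Collections;
import java.util.Properties;

public class KeyOrderedProcessing {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "parallel-consumer-demo");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "false"); // the parallel consumer manages offsets itself

        KafkaConsumer<String, String> kafkaConsumer = new KafkaConsumer<>(props);

        ParallelConsumerOptions<String, String> options =
                ParallelConsumerOptions.<String, String>builder()
                        .ordering(ProcessingOrder.KEY) // same key -> sequential, different keys -> parallel
                        .maxConcurrency(16)            // small records keep flowing while a large one runs
                        .consumer(kafkaConsumer)
                        .build();

        ParallelStreamProcessor<String, String> processor =
                ParallelStreamProcessor.createEosStreamProcessor(options);
        processor.subscribe(Collections.singletonList("events")); // hypothetical topic name

        // Offsets below the highest contiguous completed record are committed by the library,
        // so a slow record no longer pins the committed offset for unrelated keys being worked on.
        processor.poll(record -> process(record));
    }

    private static void process(Object record) {
        // placeholder for the actual per-record work
    }
}
```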

Note that Queues for Kafka is coming in KIP-932

Upvotes: 1
