Reputation: 423
Recently I've been doing performance tests on Spark Streaming, and a few things have puzzled me.
In Spark Streaming, receivers are scheduled to run in executors on worker nodes. How many receivers does a single DStream use?
Upvotes: 3
Views: 357
Reputation: 67075
There is only one receiver per `DStream`, but you can create more than one `DStream` and `union` them together so they act as one. This is why it is suggested to run Spark Streaming against a cluster with at least N (receivers) + 1 cores. Once the data is past the receiving portion, it is mostly a simple Spark application and follows the same rules as a batch job. (This is why streaming is referred to as micro-batching.)
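The one-receiver-per-DStream pattern can be sketched like this: create several input DStreams (each gets its own receiver), then combine them with `StreamingContext.union`. The socket sources and ports below are hypothetical placeholders, not from the original question; the `local[4]` master reflects the N + 1 cores rule for 3 receivers.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object MultiReceiverSketch {
  def main(args: Array[String]): Unit = {
    // 3 receivers need 3 cores, plus at least 1 core for processing -> local[4]
    val conf = new SparkConf().setMaster("local[4]").setAppName("multi-receiver-sketch")
    val ssc  = new StreamingContext(conf, Seconds(5))

    val numReceivers = 3
    // Each socketTextStream call creates its own DStream backed by one receiver.
    // Ports 9001-9003 are made up for illustration.
    val streams = (1 to numReceivers).map(i => ssc.socketTextStream("localhost", 9000 + i))

    // Union the per-receiver DStreams into a single logical stream for downstream processing
    val unioned = ssc.union(streams)
    unioned.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

After the union, all downstream transformations see one stream, so the rest of the job is written exactly as if there were a single input source.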
Upvotes: 2