How does Hazelcast Jet assign task-to-CPU priority?

Question

If I have the following code and let's say I'm running on 10 nodes of 32 cores each:

IList<...> ds = ....; // large collection, e.g. 1e6 elements

ds
 .map()       // expensive computation
 .flatMap()   // generates 10,000x more elements for every 1 incoming element
 .rebalance()
 .map()       // expensive computation
 ....         // other transformations (i.e. can be a sink, keyby, flatmap, map, etc.)

What will Hazelcast do with respect to task-to-CPU assignment priority when the SECOND map operation wants to process the 10,000 elements that were generated from the first original element? Will it devote all 320 CPU cores (across the 10 nodes) to processing those 10,000 elements? If so, will it "boot off" already running tasks? Or will it wait for already running tasks to complete, and then give priority to the 10,000 elements coming out of the flatMap-rebalance operations? Or would the 10,000 elements be forced to run on a single core, since the remaining 319 cores are already being consumed by the output of ds (i.e. the input of the first map)? Or is there some random competition for who gets access to the CPU cores?

What I would ideally like is for Hazelcast NOT to boot off running tasks (it should let them complete), but, when deciding which task gets priority to run on a core, to choose the path that leads to the lowest latency, i.e. to process all 10,000 elements coming out of the flatMap-rebalance operations across all 320 cores.
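
To put numbers on that ideal case, here is a minimal back-of-the-envelope sketch. It assumes a perfectly even spread of one input element's fan-out over every core, which is only the outcome hoped for above and not a claim about what Jet actually does. The class and method names are made up for illustration; the constants come from the scenario in this question.

```java
public class FanOutEstimate {
    // Figures from the question: 10 nodes, 32 cores each, 1e6 inputs, 10,000x fan-out.
    static final int NODES = 10;
    static final int CORES_PER_NODE = 32;
    static final long INPUT_ELEMENTS = 1_000_000L;
    static final long FAN_OUT = 10_000L;

    // Total cores in the cluster.
    static int totalCores() {
        return NODES * CORES_PER_NODE;
    }

    // Total elements the second map() would see across the whole job.
    static long totalDownstreamElements() {
        return INPUT_ELEMENTS * FAN_OUT;
    }

    // Smallest per-core share if one element's fan-out were spread evenly (assumption).
    static long minPerCore() {
        return FAN_OUT / totalCores();
    }

    // Largest per-core share under the same even-spread assumption (ceiling division).
    static long maxPerCore() {
        return (FAN_OUT + totalCores() - 1) / totalCores();
    }

    public static void main(String[] args) {
        System.out.println("cores: " + totalCores());
        System.out.println("downstream elements: " + totalDownstreamElements());
        System.out.println("per-core share of one fan-out: "
                + minPerCore() + " to " + maxPerCore());
    }
}
```

In other words, in the ideal case each core would handle only about 31 or 32 of the generated elements per original element, rather than one core handling all 10,000.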

Note: I asked a virtually identical question about Flink a few weeks ago, but have since switched to trying out Hazelcast: "How does Flink (in streaming mode) assign task-to-CPU priority?"
