Konstantin Bodnia
Konstantin Bodnia

Reputation: 1513

Start parallel processing at the same time in Akka Streams

I'm trying to do a trick with Akka Streams, where a batch of elements would get processed at the same time simultaneously. I noticed that even if you create a Balance and feed it with a sequence it would start execution for every element once it comes to the flow.

Is there any way to batch or buffer elements until they reach some threshold and then start parallel execution at the same time? Can it be done with Akka Streams tools, or maybe it will need some java/scala concurrency coding?

Upvotes: 0

Views: 694

Answers (1)

Mateusz Kubuszok
Mateusz Kubuszok

Reputation: 27535

You have some options.

There is a whole family of grouping functions grouped(Int), groupWithin(Int, FiniteDuration), etc, that you can use to create a collection of elements emitted until some threshold is filled and/or within time window, etc. Once you have this batch, you could mapAsync it, and there you could use some fine grained control over Future, e.g. you could create Future for each element, merge them with Future.sequence and map the result of parallel operations.

stream
  .grouped(10)
  .mapAsync(1) { collection =>
     // create future processing all values in collection at once
  }

If you have no problem about processing more than one batch at once you can increase parallelism in mapAsync. If you don't need to combine the grouped values in any way then maybe mapAsync with higher parallelism (or mapAsyncUnordered) will be enough for your needs.

You have to remember though that values in both grouped and in mapAsync have to be set up reasonably, because e.g. if you try to group 1M elements you might run into OOM errors.

Upvotes: 1

Related Questions