Xperia

Reputation: 79

Should I use async or sync code in Apache Flink?

When my application does I/O (database, third-party API, ...), I use async I/O, as the Flink docs recommend: https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html. But my application talks to the database most of the time. Should I always use async?

I have several questions:

  1. If I use async code (CompletableFuture), my application isn't blocked the way it is with sync code. So in terms of performance, is async code better than sync code?
  2. How does performance compare if I use sync code and increase the parallelism?
  3. When I use async code, after a while my application throws the exception "Mailbox is in state CLOSED, but is required to be in state OPEN for put operations". Is this related to opening too many threads?

Upvotes: 0

Views: 1509

Answers (1)

David Anderson

Reputation: 43524

Using asynchronous i/o is better for these reasons:

  1. Better resource utilization. If you make synchronous requests then one task will be handling just one request at a time. With asynchronous requests, a single task can be handling dozens of in-flight requests.
  2. While your user code is blocked waiting for a response to a synchronous request, that operator cannot participate in checkpointing. In the best case this makes checkpointing slow, and it can lead to checkpoint timeouts and job failure.
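To illustrate the first point outside of Flink, here is a minimal plain-Java sketch. It is not Flink's Async I/O API; it only uses `CompletableFuture` (Java 9+ for `delayedExecutor`) to contrast waiting on one request at a time with keeping many requests in flight. The `lookupAsync` client and the 50 ms latency are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class SyncVsAsyncLookup {
    // Assumed per-request latency of the external system (hypothetical value).
    static final long LATENCY_MS = 50;

    // Hypothetical non-blocking client: the future completes after LATENCY_MS
    // without tying up the calling thread in the meantime.
    static CompletableFuture<String> lookupAsync(int key) {
        return CompletableFuture.supplyAsync(
                () -> "value-" + key,
                CompletableFuture.delayedExecutor(LATENCY_MS, TimeUnit.MILLISECONDS));
    }

    // Synchronous style: block on each request before issuing the next one.
    static List<String> runSync(int n) {
        List<String> results = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            results.add(lookupAsync(i).join());
        }
        return results;
    }

    // Asynchronous style: issue all requests first, then collect the results,
    // so all n requests are in flight at the same time.
    static List<String> runAsync(int n) {
        List<CompletableFuture<String>> inFlight = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            inFlight.add(lookupAsync(i));
        }
        List<String> results = new ArrayList<>();
        for (CompletableFuture<String> f : inFlight) {
            results.add(f.join());
        }
        return results;
    }

    // Wall-clock time of a task in milliseconds.
    static long timeMs(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int n = 20;
        System.out.println("sync:  " + timeMs(() -> runSync(n)) + " ms");
        System.out.println("async: " + timeMs(() -> runAsync(n)) + " ms");
    }
}
```

Both variants return the same results. The synchronous one pays the latency once per request, while the asynchronous one overlaps the waiting, which is the effect Flink's async operator gives a single task.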

Yes, you can make synchronous i/o work by increasing the parallelism. But that's throwing resources at a problem that has a better solution.
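As a rough back-of-the-envelope estimate, assume a fixed latency $L$ per request, a parallelism of $p$, at most $c$ in-flight requests per parallel task, and an external system that is not itself the bottleneck. These symbols and numbers are illustrative assumptions, not measurements:

```latex
T_{\text{sync}} \approx \frac{p}{L},
\qquad
T_{\text{async}} \lesssim \frac{p \cdot c}{L}
```

With $L = 0.05\,\text{s}$, $p = 4$ and $c = 50$, that gives about 80 requests/s for synchronous i/o versus an upper bound of about 4000 requests/s for asynchronous i/o. Matching that upper bound with synchronous requests alone would take a parallelism of roughly 200.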

As for the Mailbox problem, I believe this can only occur if the job is shutting down. I think this is a side effect of some other problem that has caused the job to fail. Maybe look around in the logs for other indications of what's going on.

Upvotes: 2
