Nicole55

Reputation: 23

Why does Flink run slower on a standalone cluster than in the IDE?

I ran my Flink program (written in Scala) both in my IDE (IntelliJ) and on a standalone cluster. The program prints its running time: I got 20 s in the IDE and 74 s on the standalone cluster. I am confused about why it takes so much longer on a cluster with a parallelism of 10. I am basically trying to compare Flink's performance with Spark's. Can someone help me understand why this happens? Thank you.

Added:

A sample of my program can be found here. The time printed to the console for this particular code is shown below:

Settings I changed in the Flink standalone cluster config:

Command used to run the Flink jar: flink run --class flinkutils.generated.Test2Agg2Spark ./target/scala-2.12/executorflink_2.12-0.1.jar

Upvotes: 0

Views: 219

Answers (1)

David Anderson

Reputation: 43717

One factor affecting performance is that when the job runs in the IDE, everything runs within a single JVM and data is passed around in memory, whereas on the cluster the data goes through the TCP stack.

But this is a complex scenario, and many other factors may also be negatively impacting performance.
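To get a rough feel for the in-memory versus TCP difference, here is a minimal sketch in plain Scala. It is not Flink code and it is not a benchmark of Flink's network stack. It is a toy comparison that sums the same records once directly in memory and once after pushing them through a loopback socket. The object name `HandoffSketch`, its helper names, and the record count are all made up for illustration, and the numbers it prints will vary by machine and JVM warm-up.

```scala
import java.io.{BufferedInputStream, BufferedOutputStream, DataInputStream, DataOutputStream}
import java.net.{InetAddress, ServerSocket, Socket}

// Hypothetical toy example: names and sizes are assumptions, not part of Flink.
object HandoffSketch {

  // Records are handed over as plain in-memory values (same JVM, no serialization).
  def sumInMemory(records: Array[Long]): Long = {
    var sum = 0L
    var i = 0
    while (i < records.length) { sum += records(i); i += 1 }
    sum
  }

  // Records are written to a loopback TCP socket by one thread and read back by another.
  def sumOverLoopback(records: Array[Long]): Long = {
    val server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress)
    try {
      val sender = new Thread(new Runnable {
        override def run(): Unit = {
          val socket = new Socket(InetAddress.getLoopbackAddress, server.getLocalPort)
          val out = new DataOutputStream(new BufferedOutputStream(socket.getOutputStream))
          try {
            out.writeInt(records.length)           // tell the reader how many records follow
            records.foreach(r => out.writeLong(r)) // serialize each record to bytes
            out.flush()
          } finally {
            out.close()
            socket.close()
          }
        }
      })
      sender.start()
      val conn = server.accept()
      val in = new DataInputStream(new BufferedInputStream(conn.getInputStream))
      try {
        val n = in.readInt()
        var sum = 0L
        var i = 0
        while (i < n) { sum += in.readLong(); i += 1 } // deserialize each record
        sender.join()
        sum
      } finally {
        in.close()
        conn.close()
      }
    } finally server.close()
  }

  // Runs the body once and returns its result with the elapsed wall-clock time in ms.
  def timeMs[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  def main(args: Array[String]): Unit = {
    val records = Array.tabulate(1000000)(_.toLong)
    val (memSum, memMs) = timeMs(sumInMemory(records))
    val (tcpSum, tcpMs) = timeMs(sumOverLoopback(records))
    println(s"in-memory:    sum=$memSum in $memMs ms")
    println(s"loopback TCP: sum=$tcpSum in $tcpMs ms")
  }
}
```

Both paths compute the same sum, so any timing gap comes from the serialization and the socket round trip. On a real cluster the data also crosses a physical network between machines, which this sketch does not model.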

FWIW, Flink SQL gets good performance on the TPC-H benchmark (if properly configured).

Upvotes: 1
