Aviral Srivastava

Reputation: 4582

How can I reduce the time a Glue ETL job (Spark) takes to actually start executing?

I want to run a Glue ETL job. The execution time itself is fine, but Glue takes far too long to actually start executing the job.

I looked through the documentation and various answers, but none of them offered a solution. Some explained the behavior as a cold start, but gave no fix.

I would like the job to start as soon as possible. It sometimes takes around 10 minutes to start a job that then runs in 2 minutes.

Upvotes: 0

Views: 2723

Answers (1)

Yuriy Bondaruk

Reputation: 4750

Unfortunately, this isn't possible at the moment. Glue uses EMR under the hood, and it takes some time to spin up a new cluster with the desired number of executors. As far as I know, they keep a pool of spare EMR clusters for the most common DPU configurations, so if you are lucky your job gets one and starts immediately; otherwise it has to wait until a cluster is ready.
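You can't remove the wait, but you can at least measure it. Below is a rough sketch, assuming the job run record exposes `StartedOn`, `CompletedOn`, and `ExecutionTime` (in seconds) the way the Glue `JobRun` structure does. The helper name `startup_overhead_seconds` and the sample values are made up for illustration; they just mirror the 10 minute wait and 2 minute run from the question.

```python
from datetime import datetime, timedelta


def startup_overhead_seconds(job_run: dict) -> float:
    """Estimate how long a run waited before the script actually executed.

    Assumes job_run has StartedOn and CompletedOn (datetime) and
    ExecutionTime (seconds the run actually consumed resources).
    """
    wall_clock = (job_run["CompletedOn"] - job_run["StartedOn"]).total_seconds()
    # Whatever is left after subtracting the execution time is treated as
    # startup wait; clamp at zero in case the two clocks disagree slightly.
    return max(wall_clock - job_run["ExecutionTime"], 0.0)


# Hypothetical run: 12 minutes wall clock, 2 minutes of actual execution.
started = datetime(2019, 1, 1, 12, 0, 0)
example_run = {
    "StartedOn": started,
    "CompletedOn": started + timedelta(minutes=12),
    "ExecutionTime": 120,
}

print(startup_overhead_seconds(example_run))  # 600.0
```

With boto3, a dict of this shape should be available from `glue.get_job_run(JobName=..., RunId=...)["JobRun"]`. Logging this number over several runs would show how often a job lands on a warm cluster versus a cold one.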

Upvotes: 3
