J.Done

Reputation: 3033

Using the same jar with spark-submit

I deploy a job in YARN cluster mode using spark-submit with my jar file. The job deploys correctly every time, but even though I always submit the same jar file, it is uploaded to Hadoop on every submission. Uploading the same jar each time seems unnecessary. Is there a way to upload it once and have later YARN jobs reuse it?

Upvotes: 0

Views: 389

Answers (1)

Sanchit Grover

Reputation: 1008

You can put your Spark jar in HDFS and then submit with --master yarn-cluster. That way you save the time needed to upload the jar to HDFS on every submission.
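
A rough sketch of the HDFS approach, assuming the jar in question is the application jar. The paths, the class name `com.example.MyJob`, and the `HDFS_CMD` / `SPARK_SUBMIT` override variables are placeholders for illustration, not details from the question. On Spark 2.0 and later, `--master yarn --deploy-mode cluster` is the replacement for the deprecated `yarn-cluster` master. As far as I know, Spark skips the copy to its staging directory when the jar already sits on the cluster's own filesystem, but check this on your version.

```shell
# Hypothetical paths and class name; adjust for your cluster.
LOCAL_JAR="my-job.jar"
HDFS_DIR="/apps/my-job"
HDFS_JAR="hdfs://${HDFS_DIR}/my-job.jar"
MAIN_CLASS="com.example.MyJob"

# Run once (or after each rebuild): copy the jar into HDFS.
upload_jar_once() {
    "${HDFS_CMD:-hdfs}" dfs -mkdir -p "$HDFS_DIR" &&
    "${HDFS_CMD:-hdfs}" dfs -put -f "$LOCAL_JAR" "$HDFS_DIR/"
}

# Run for every job: reference the HDFS copy instead of a local file.
# Extra arguments are passed through to the application.
submit_job() {
    "${SPARK_SUBMIT:-spark-submit}" \
        --master yarn \
        --deploy-mode cluster \
        --class "$MAIN_CLASS" \
        "$HDFS_JAR" "$@"
}
```

Call `upload_jar_once` a single time, then `submit_job` for each run.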

Another alternative is to put your jar on the Spark classpath on every node, which has the following drawbacks:

  1. If you have more than 30 nodes, it would be very tedious to scp your jar to each node.
  2. If your Hadoop cluster is upgraded and Spark is freshly installed, you would have to redeploy the jar.
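
For comparison, the per-node alternative might look like the sketch below. It uses the `local:` URI scheme, which tells Spark the file already exists at that path on each node, rather than editing the classpath directly; that substitution is my assumption about the simplest way to do this. The host names, the path, the class name, and the `SCP_CMD` / `SPARK_SUBMIT` override variables are all hypothetical.

```shell
# Hypothetical host list, path, and class name; adjust for your cluster.
NODES="node01 node02 node03"
NODE_JAR="/opt/jobs/my-job.jar"
JOB_CLASS="com.example.MyJob"

# Copy the jar to the same path on every node.
# This has to be repeated after each rebuild or cluster reinstall.
distribute_jar() {
    for node in $NODES; do
        "${SCP_CMD:-scp}" my-job.jar "$node:$NODE_JAR" || return 1
    done
}

# local: means the jar is expected to be present on each node already,
# so nothing is uploaded at submit time.
submit_local_jar() {
    "${SPARK_SUBMIT:-spark-submit}" \
        --master yarn \
        --deploy-mode cluster \
        --class "$JOB_CLASS" \
        "local:$NODE_JAR" "$@"
}
```

The loop in `distribute_jar` is exactly the step that becomes tedious as the node count grows, which is why the HDFS copy is usually easier to maintain.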

Upvotes: 2
