User3

Reputation: 2535

How to run a standalone jar from Spark

I am very new to Spark and still learning, so please bear with me if I talk like a novice.

I have a regular Java jar which is self-contained.

The function of this jar is to listen to a queue and process some messages. The requirement now is to read from the queue in a distributed fashion, so I have a Spark master and three slaves managed by YARN. When I ./spark-submit this jar file against the standalone master, everything works fine. When I switch to cluster mode by setting YARN as the master on the command line, I get lots of "file not found" errors pointing at HDFS. I read up on Stack Overflow and saw that I have to create a SparkContext, however I see no use for it in my case.
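
For context, this is roughly how I am submitting in the two modes (the host, class, and jar names below are placeholders, not my real ones):

# works: submitting against the standalone master
./bin/spark-submit \
--class com.example.TibcoMessageConsumer \
--master spark://master-host:7077 \
myapp.jar

# fails with the HDFS "file not found" errors: YARN cluster mode
./bin/spark-submit \
--class com.example.TibcoMessageConsumer \
--master yarn-cluster \
myapp.jar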

So here is my question:

Do I still have to use the following?

SparkConf conf = new SparkConf().setMaster("yarn-cluster").setAppName("TibcoMessageConsumer");
SparkContext sparkContext = new SparkContext(conf);

I don't see any usage of sparkContext in my case.
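
If it helps, here is a minimal sketch of what my main would look like if I did create a context. I am assuming the JavaSparkContext wrapper here, with the master passed via spark-submit rather than hardcoded; the queue-listening part stands in for my existing code:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class TibcoMessageConsumer {
    public static void main(String[] args) {
        // master is supplied by spark-submit, so only the app name is set here
        SparkConf conf = new SparkConf().setAppName("TibcoMessageConsumer");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // ... existing queue-listening and message-processing code,
        // which never touches sc ...

        sc.stop();
    }
}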

Upvotes: 2

Views: 7600

Answers (1)

ganeiy

Reputation: 294

Since you are using YARN, copy the jar to HDFS and then reference it in spark-submit. If you want to use the local file system instead, you have to copy the jar to every worker node [not recommended].

./bin/spark-submit \
--class <main-class> \
--master <master-url> \
--deploy-mode cluster \
myapp-jar
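
As a minimal sketch of that (all paths, class, and jar names below are placeholders): upload the jar with hdfs dfs -put, then point spark-submit at the hdfs:// path so the YARN nodes can fetch it themselves:

# upload the application jar to HDFS
hdfs dfs -put myapp-jar /user/me/myapp-jar

# submit in YARN cluster mode, referencing the HDFS copy
./bin/spark-submit \
--class com.example.TibcoMessageConsumer \
--master yarn \
--deploy-mode cluster \
hdfs:///user/me/myapp-jar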

You can look at this link for more details.

Upvotes: 1
