DAVID_ROA

Reputation: 309

setJars() method in running Spark SQL app from IDE

I am creating a Spark SQL application and I want to run it on a remote Spark cluster from my local machine, using my IDE. I know that I should set some options when I create the SparkConf object, something like this:

SparkConf conf = new SparkConf()
.setMaster("spark://SPARK-MASTER-ADDRESS:7077")
.set("spark.driver.host","my local IP Address")
.setJars(new String[]{"build\\libs\\spark-test-1.0-SNAPSHOT.jar"})
.setAppName("APP-NAME");

This works from the IDE and everything is OK, but my questions are:

1) Do I need to rebuild my app's jar file and pass its path to the setJars method every time I change anything? Some forum posts say that you need to build the jar every time you change anything, but rebuilding the jar on every change seems cumbersome. Is there a better way?

2) Why is it sometimes unnecessary to call setJars, even though I run the program from the IDE? For example, when I do not use a lambda function in my code, there is no need to call setJars. Assume I have a Person class with two fields, customerNo and accountNo. When I use a lambda function like this (personDS is a Dataset of Person objects):

personDS.filter(f -> f.getCustomerNo().equals("001")).show();

the following error occurs:

java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD

but when I don't use a lambda function, like this:

personDS.filter(col("customerNo").equalTo("001")).show();

no error occurs. Why does this happen? Why do I have to use setJars when I use a lambda function? When should I use setJars and when not?

Upvotes: 0

Views: 541

Answers (1)

Raj

Reputation: 727

I am assuming here that you are not using spark-submit and that you are running the Spark program directly from your IDE.

Below is my answer to your first question:

1) Do I need to rebuild the jar file of my app every time I change anything? Yes. To deploy your changes you need to build the jar each time you change the code. I use Maven for this.
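Even if the rebuild itself cannot be avoided, the jar path does not have to be hard-coded. Here is a minimal sketch, assuming the build tool writes its jars into a single directory such as build/libs. JarLocator and findJars are hypothetical names used only for illustration and are not part of Spark:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class JarLocator {

    /** Returns the absolute paths of all .jar files directly inside dir, sorted by path. */
    public static String[] findJars(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .map(p -> p.toAbsolutePath().toString())
                    .sorted()
                    .toArray(String[]::new);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: jars are written to build/libs unless another directory is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : "build/libs");
        if (!Files.isDirectory(dir)) {
            System.out.println("No such directory: " + dir);
            return;
        }
        for (String jar : findJars(dir)) {
            System.out.println(jar);
        }
    }
}
```

The result could then be passed as .setJars(JarLocator.findJars(Paths.get("build/libs"))), and the jar build could be wired into the IDE's run configuration so that it runs before each launch.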

For the second question:

I think that whenever you perform any kind of map operation using a lambda that refers to methods or classes of your project, you need to supply them as an additional jar.
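The dependency of a lambda on project classes can be seen without a cluster. Here is a minimal sketch in plain Java, with no Spark on the classpath. SerializablePredicate is a hypothetical stand-in for Spark's serializable function interfaces, and capturingClassOf is an illustrative helper, not a Spark API. It shows that the serialized form of a lambda names the application class that holds the lambda body. This is likely why an executor cannot rebuild the lambda unless the application jar has been shipped to it.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

public class LambdaOrigin {

    /** Hypothetical stand-in for a serializable function interface. */
    public interface SerializablePredicate<T> extends Serializable {
        boolean test(T value);
    }

    /** A sample filter; the compiler places its body in a synthetic method of LambdaOrigin. */
    public static SerializablePredicate<String> sampleFilter() {
        return customerNo -> customerNo.equals("001");
    }

    /** Returns the name of the class that a serialized lambda refers back to. */
    public static String capturingClassOf(Serializable lambda) throws Exception {
        // Serializable lambdas carry a private writeReplace method that
        // produces the SerializedLambda written to the stream.
        Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
        writeReplace.setAccessible(true);
        SerializedLambda form = (SerializedLambda) writeReplace.invoke(lambda);
        return form.getCapturingClass();
    }

    public static void main(String[] args) throws Exception {
        // Prints the application class that must be loadable wherever
        // the lambda is deserialized.
        System.out.println(capturingClassOf(sampleFilter()));
    }
}
```

A column expression such as col("customerNo").equalTo(...) is built only from Spark's own classes, which would explain why that variant runs without setJars.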

Upvotes: 2
