Reputation: 357
I've been using Apache Spark for quite a while now, but I'm hitting an error that never happened before when executing the following example (I've just updated to Spark 2.1.1):
./opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/bin/run-example SparkPi
Here is the actual stacktrace:
17/07/05 10:50:54 ERROR SparkContext: Failed to add file:/opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/examples/jars/spark-warehouse/ to Spark environment
java.lang.IllegalArgumentException: Directory /opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/examples/jars/spark-warehouse is not allowed for addJar
at org.apache.spark.SparkContext.liftedTree1$1(SparkContext.scala:1735)
at org.apache.spark.SparkContext.addJar(SparkContext.scala:1729)
at org.apache.spark.SparkContext$$anonfun$11.apply(SparkContext.scala:466)
at org.apache.spark.SparkContext$$anonfun$11.apply(SparkContext.scala:466)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:466)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2320)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Pi is roughly 3.1433757168785843
I don't know if it is actually an error or if I'm missing something, because the example runs anyway; you can see the "Pi is roughly..." result at the end.
Here are the configuration lines for spark-env.sh:
export SPARK_MASTER_IP=X.X.X.X
export SPARK_MASTER_WEBUI_PORT=YYYY
export SPARK_WORKER_CORES=4
export SPARK_WORKER_MEMORY=7g
Here are the configuration lines for spark-defaults.conf:
spark.master local[*]
spark.driver.cores 4
spark.driver.memory 2g
spark.executor.cores 4
spark.executor.memory 4g
spark.ui.showConsoleProgress false
spark.driver.extraClassPath /opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/lib/postgresql-9.4.1207.jar
spark.eventLog.enabled true
spark.eventLog.dir file:///opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/logs
spark.history.fs.logDirectory file:///opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/logs
Apache Spark version: 2.1.1
Java version: 1.8.0_91
Python version: 2.7.5
I've tried configuring it with this, with no success:
spark.sql.warehouse.dir file:///c:/tmp/spark-warehouse
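(For reference, the equivalent way to set that property when building the session in code would be something like the sketch below; the app name and path are just placeholders, not my actual values.)
import org.apache.spark.sql.SparkSession

// Minimal sketch: set spark.sql.warehouse.dir before the
// SparkSession (and its underlying SparkContext) is created.
val spark = SparkSession.builder()
  .appName("warehouse-dir-example")                                 // placeholder app name
  .master("local[*]")
  .config("spark.sql.warehouse.dir", "file:///tmp/spark-warehouse") // placeholder path
  .getOrCreate()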
It is weird because when I compile a script and launch it with spark-submit I don't get this error. I couldn't find any JIRA tickets about it either.
Upvotes: 4
Views: 5123
Reputation: 842
I had a similar issue with my Java Spark code. Even though your issue is with PySpark, maybe this will help you or someone else.
I had to specify some dependency jars to Spark using the --jars option. Initially I gave it the path to the directory containing all the dependency jars (i.e. --jars <path-to-dependency>/) and I got the above error.
The --jars option (of spark-submit) seems to accept paths only to actual jar files (<path-to-directory>/<name>.jar), not just the directory path (<path-to-directory>/).
The issue was resolved for me when I moved all the dependencies into a single dependency jar and passed that to the --jars option, as below:
~/spark/bin/spark-submit --class "<class-name>" --jars '<path-to-dependency-jars>/<dependency-jar>.jar' --master local <dependency-jar>.jar <input-val1> <input-val2>
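As far as I know, --jars will also take a comma-separated list of individual jar files, if you'd rather not build one combined dependency jar; the paths below are only placeholders:
~/spark/bin/spark-submit --class "<class-name>" --jars /path/to/dep1.jar,/path/to/dep2.jar --master local <application-jar>.jar <input-val1> <input-val2>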
Upvotes: 2
Reputation: 129
Somewhere in the code, the SparkContext is being told to add /opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/examples/jars/spark-warehouse as a jar.
This is not allowed, and it throws a java.lang.IllegalArgumentException.
You can see this at line 1812 of the SparkContext.scala class: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala
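For illustration, here is a minimal Scala sketch (paths are placeholders, behavior inferred from the stack trace above) that hits the same check through SparkContext.addJar; as in the run-example output, the exception is logged as an ERROR by SparkContext itself, so the application keeps running:
import org.apache.spark.{SparkConf, SparkContext}

object AddJarDirectoryRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("addJar-directory-repro")
    val sc = new SparkContext(conf)

    // An existing jar file is accepted and distributed to executors.
    sc.addJar("/tmp/deps/some-dependency.jar") // placeholder path

    // A directory is rejected with
    // "java.lang.IllegalArgumentException: Directory ... is not allowed for addJar";
    // SparkContext catches it and logs "Failed to add ... to Spark environment",
    // so the job still continues.
    sc.addJar("/tmp/deps/") // placeholder path

    sc.stop()
  }
}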
Upvotes: 0