sudheeshix

Reputation: 1591

Where can I find the jars folder in Spark 1.6?

From the Spark downloads page, if I download the tar file for v2.0.1, I see that it contains a jars folder with some jars that I find useful to include in my app.

If I download the tar file for v1.6.2 instead, I don't find the jars folder in there. Is there an alternate package type I should use from that site? I am currently choosing the default (pre-built for Hadoop 2.6). Alternatively, where can I find those Spark jars - should I get each of them individually from http://spark-packages.org?

Here is an indicative list of jars I want to use:

Upvotes: 3

Views: 12501

Answers (1)

Samson Scharfrichter

Reputation: 9067

The way Spark ships its runtime changed from V1 to V2.

  • In V2, by default, you have multiple JARs under $SPARK_HOME/jars
  • In V1, by default, there was just one massive spark-assembly*.jar under $SPARK_HOME/lib that contained all the dependencies.

I believe you can change the default behavior, but that would require recompiling Spark on your own...
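
If you are not sure which layout a given download has, here is a quick check (a sketch; it assumes $SPARK_HOME points at the unpacked tarball):

  # V2 layout: many individual JARs
  ls "${SPARK_HOME}/jars" 2>/dev/null | head
  # V1 layout: one big assembly under lib/
  ls "${SPARK_HOME}/lib/"spark-assembly-*.jar 2>/dev/null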

And also, about spark-csv specifically:

  • In V2, the CSV file format is natively supported by SparkSQL
  • In V1, you have to download spark-csv (for Scala 2.10) from Spark-Packages.org plus commons-csv from Commons.Apache.org and add both JARs to your CLASSPATH
    (with --jars on the command line, or with the property spark.driver.extraClassPath plus an sc.addJar() call if the command line does not work for some reason)
    ...and the syntax is more cumbersome, too
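
For instance, launching a V1 shell with both JARs might look like this (a sketch; the paths and version numbers are illustrative, match them to your Spark/Scala build):

  # put the spark-csv package and its commons-csv dependency on the classpath
  spark-shell --jars /path/to/spark-csv_2.10-1.5.0.jar,/path/to/commons-csv-1.1.jar

Inside the REPL, the V1 syntax is sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("data.csv"), versus the native spark.read.option("header", "true").csv("data.csv") in V2.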


Excerpt from the vanilla $SPARK_HOME/bin/spark-class as of Spark 2.1.x (greatly simplified)

  # Find Spark jars
  SPARK_JARS_DIR="${SPARK_HOME}/jars"
  LAUNCH_CLASSPATH="$SPARK_JARS_DIR/*"

And as of Spark 1.6.x

  # Find assembly jar
  ASSEMBLY_DIR="${SPARK_HOME}/lib"
  ASSEMBLY_JARS="$(ls -1 "$ASSEMBLY_DIR" | grep "^spark-assembly.*hadoop.*\.jar$" || true)"
  SPARK_ASSEMBLY_JAR="${ASSEMBLY_DIR}/${ASSEMBLY_JARS}"
  LAUNCH_CLASSPATH="$SPARK_ASSEMBLY_JAR"
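
To confirm that the single assembly really bundles what V2 splits into separate JARs, you can list its contents (a sketch; the exact file name depends on your download, e.g. spark-assembly-1.6.2-hadoop2.6.0.jar for the pre-built-for-Hadoop-2.6 tarball):

  # Spark SQL classes ship inside the assembly, not in a separate spark-sql JAR
  jar tf "$SPARK_HOME"/lib/spark-assembly-*.jar | grep "^org/apache/spark/sql/" | head -5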

Upvotes: 9
