Reputation: 811
I'm having trouble importing magellan-1.0.4-s_2.11
in spark notebook. I've downloaded the jar from https://spark-packages.org/package/harsha2010/magellan and have tried placing SPARK_HOME/bin/spark-shell --packages harsha2010:magellan:1.0.4-s_2.11
in the Start of Customized Settings
section of the spark-notebook file of the bin folder.
Here are my imports
import magellan.{Point, Polygon, PolyLine}
import magellan.coord.NAD83
import org.apache.spark.sql.magellan.MagellanContext
import org.apache.spark.sql.magellan.dsl.expressions._
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
And my errors...
<console>:71: error: object Point is not a member of package org.apache.spark.sql.magellan
import magellan.{Point, Polygon, PolyLine}
^
<console>:72: error: object coord is not a member of package org.apache.spark.sql.magellan
import magellan.coord.NAD83
^
<console>:73: error: object MagellanContext is not a member of package org.apache.spark.sql.magellan
import org.apache.spark.sql.magellan.MagellanContext
I then tried to import the new library like any other library by placing it into the main script
like so:
$lib_dir/magellan-1.0.4-s_2.11.jar"
This didn't work and I'm left scratching my head wondering what I've done wrong. How do I import libraries such as magellan into spark notebook?
Upvotes: 9
Views: 6464
Reputation: 11
The easy way, you should set or add the EXTRA_CLASSPATH
environnent variable to point to your .jar
file downloaded :
export EXTRA_CLASSPATH = </link/to/your.jar>
or set EXTRA_CLASSPATH= </link/to/your.jar>
in wondows OS. Here find the detailed solution.
Upvotes: 0
Reputation: 224
I would suggest to check this:
and
https://github.com/spark-notebook/spark-notebook/blob/master/docs/metadata.md#add-spark-packages
I think the :dp
magic command is depreciated, instead you should add your custom dependencies in the notebook metadata. You can go in the menu Edit > Edit notebook metadata, there add something like:
"customDeps": [
"harsha2010 % magellan % 1.0.4-s_2.11"
]
Once done, you will need to restart the kernel, you can check in the browser console if the package is being downloaded properly.
Upvotes: 1
Reputation: 27535
Try evaluating something like
:dp "harsha2010" % "magellan" % "1.0.4-s_2.11"
It will load the library into Spark, allowing it to be import
ed - assuming it can be obtained though the Maven repo. In my case it failed with a message:
failed to load 'harsha2010:magellan:jar:1.0.4-s_2.11 (runtime)' from ["Maven2 local (file:/home/dev/.m2/repository/, releases+snapshots) without authentication", "maven-central (http://repo1.maven.org/maven2/, releases+snapshots) without authentication", "spark-packages (http://dl.bintray.com/spark-packages/maven/, releases+snapshots) without authentication", "oss-sonatype (https://oss.sonatype.org/content/repositories/releases/, releases+snapshots) without authentication"] into /tmp/spark-notebook/aether/b2c7d8c5-1f56-4460-ad39-24c4e93a9786
I think file was to big and connection was interrupted before whole file could be downloaded.
Workaround
So I downloaded the JAR manually from:
http://dl.bintray.com/spark-packages/maven/harsha2010/magellan/1.0.4-s_2.11/
and copied it into the:
/tmp/spark-notebook/aether/b2c7d8c5-1f56-4460-ad39-24c4e93a9786/harsha2010/magellan/1.0.4-s_2.11
And then :dp
command worked. Try Calling it first, and if it will fail copy JAR into the right path to make things work.
Better solution
I should investigate why download failed to fix it in the first place... or put that library in my local M2 repo. But that should get you going.
Upvotes: 1