Reputation: 777
I am trying to run spark-submit against a cluster on Amazon EC2 with the following command:
spark-submit --packages org.apache.hadoop:hadoop-aws:2.7.1 --master spark://amazonaws.com SimpleApp.py
and I end up with the following error. It seems to be looking for Hadoop. My EC2 cluster was created using the spark-ec2 command.
Ivy Default Cache set to: /home/adas/.ivy2/cache
The jars for the packages stored in: /home/adas/.ivy2/jars
:: loading settings :: url = jar:file:/home/adas/spark/spark-2.1.0-bin-hadoop2.7/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.hadoop#hadoop-aws added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
:: resolution report :: resolve 66439ms :: artifacts dl 0ms
:: modules in use:
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 1 | 0 | 0 | 0 || 0 | 0 |
---------------------------------------------------------------------
:: problems summary ::
:::: WARNINGS
module not found: org.apache.hadoop#hadoop-aws;2.7.1
==== local-m2-cache: tried
file:/home/adas/.m2/repository/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom
-- artifact org.apache.hadoop#hadoop-aws;2.7.1!hadoop-aws.jar:
file:/home/adas/.m2/repository/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar
==== local-ivy-cache: tried
/home/adas/.ivy2/local/org.apache.hadoop/hadoop-aws/2.7.1/ivys/ivy.xml
-- artifact org.apache.hadoop#hadoop-aws;2.7.1!hadoop-aws.jar:
/home/adas/.ivy2/local/org.apache.hadoop/hadoop-aws/2.7.1/jars/hadoop-aws.jar
==== central: tried
https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom
-- artifact org.apache.hadoop#hadoop-aws;2.7.1!hadoop-aws.jar:
https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar
==== spark-packages: tried
http://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom
-- artifact org.apache.hadoop#hadoop-aws;2.7.1!hadoop-aws.jar:
http://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar
::::::::::::::::::::::::::::::::::::::::::::::
:: UNRESOLVED DEPENDENCIES ::
::::::::::::::::::::::::::::::::::::::::::::::
:: org.apache.hadoop#hadoop-aws;2.7.1: not found
::::::::::::::::::::::::::::::::::::::::::::::
:::: ERRORS
Server access error at url https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom (java.net.NoRouteToHostException: No route to host (Host unreachable))
Server access error at url https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar (java.net.NoRouteToHostException: No route to host (Host unreachable))
Server access error at url http://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom (java.net.NoRouteToHostException: No route to host (Host unreachable))
Server access error at url http://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar (java.net.NoRouteToHostException: No route to host (Host unreachable))
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.apache.hadoop#hadoop-aws;2.7.1: not found]
at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1078)
at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:296)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:160)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Upvotes: 1
Views: 1222
Reputation: 81
You are submitting the job with the --packages org.apache.hadoop:hadoop-aws:2.7.1 option, so spark-submit tries to resolve that dependency by downloading it from the public Maven repositories. However, this error indicates that it cannot reach them:
Server access error at url https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom (java.net.NoRouteToHostException: No route to host (Host unreachable))
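You might want to check whether the machine where you run spark-submit has outbound internet access. The resolution happens in the spark-submit process itself, as the SparkSubmit frames in your stack trace show, so that is the host that needs to reach repo1.maven.org.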
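To narrow this down, here is a rough sketch of a connectivity check you could run on that host. The helper names maven_artifact_url and can_connect are my own, not part of Spark or Ivy. The URL layout is copied from the "central: tried" lines in your log, and the check only tests whether a TCP connection can be opened, which is what NoRouteToHostException says is failing.

```python
import socket
from urllib.parse import urlparse


def maven_artifact_url(coordinate, repo="https://repo1.maven.org/maven2", ext="pom"):
    """Build the repository URL for a 'group:artifact:version' coordinate.

    The layout mirrors the URLs printed in the Ivy resolution log.
    """
    group, artifact, version = coordinate.split(":")
    group_path = group.replace(".", "/")
    return f"{repo}/{group_path}/{artifact}/{version}/{artifact}-{version}.{ext}"


def can_connect(url, timeout=5.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, timeouts and "no route to host".
        return False
```

Calling can_connect(maven_artifact_url("org.apache.hadoop:hadoop-aws:2.7.1")) on the machine where you run spark-submit should return True once outbound access works. If it returns False, look at the security group, route table or proxy settings for that host before retrying spark-submit.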
Upvotes: 1