GJ.

Reputation: 111

Why does starting spark-shell fail with "we couldn't find any external IP address!" on Windows?

I am having trouble starting spark-shell on my Windows computer. The version of Spark I am using is 1.5.2, pre-built for Hadoop 2.4 or later. I thought spark-shell.cmd could be run directly without any configuration since it is pre-built, but I cannot figure out what is preventing me from starting Spark correctly.

Aside from the error message printed out, I can still execute some basic Scala commands on the command line, but apparently something is going wrong here.

Here is the error log from cmd:

log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_25)
Type in expressions to have them evaluated.
Type :help for more information.
15/11/18 17:51:32 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
Spark context available as sc.
15/11/18 17:51:39 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/spark-1.5.2-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/spark-1.5.2-bin-hadoop2.4/bin/../lib/datanucleus-core-3.2.10.jar."
15/11/18 17:51:39 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/spark-1.5.2-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/spark-1.5.2-bin-hadoop2.4/bin/../lib/datanucleus-rdbms-3.2.9.jar."
15/11/18 17:51:39 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/spark-1.5.2-bin-hadoop2.4/bin/../lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/spark-1.5.2-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar."
15/11/18 17:51:39 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/18 17:51:40 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/18 17:51:46 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/18 17:51:46 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/11/18 17:51:47 WARN : Your hostname, Lenovo-PC resolves to a loopback/non-reachable address: fe80:0:0:0:297a:e76d:828:59dc%wlan2, but we couldn't find any external IP address!
java.lang.RuntimeException: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
        at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
        at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:162)
        at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:160)
        at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:167)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
        at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
        at $iwC$$iwC.<init>(<console>:9)
        at $iwC.<init>(<console>:18)
        at <init>(<console>:20)
        at .<init>(<console>:24)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:132)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
        at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
        at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
        at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
        at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
        at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
        at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
        at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
        at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1097)
        at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:559)
        at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:534)
        at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:599)
        at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
        ... 56 more

<console>:10: error: not found: value sqlContext
       import sqlContext.implicits._
              ^
<console>:10: error: not found: value sqlContext
       import sqlContext.sql
              ^

Upvotes: 3

Views: 7682

Answers (1)

Jacek Laskowski

Reputation: 74669

There are a couple of issues here. You're on Windows, and things work differently on this OS compared to the POSIX-compliant OSes Spark usually runs on.

Start by reading the Problems running Hadoop on Windows document and see whether a missing WINUTILS.EXE is the issue; the NullPointerException thrown from org.apache.hadoop.util.Shell in your stack trace is a typical symptom. Make sure you run spark-shell in a console with administrator rights.
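
If winutils.exe does turn out to be missing, a minimal cmd setup looks like this (a sketch; C:\hadoop is an assumed install location, and you need a winutils.exe built for your Hadoop version):

rem Assumes winutils.exe has been downloaded into C:\hadoop\bin
set HADOOP_HOME=C:\hadoop
set PATH=%HADOOP_HOME%\bin;%PATH%
rem Hive (used by spark-shell's SQLContext) needs a writable \tmp\hive
C:\hadoop\bin\winutils.exe chmod 777 \tmp\hive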

You may also want to read the answers to a similar question: Why does starting spark-shell fail with NullPointerException on Windows?

Also, you may have started spark-shell from inside the bin subdirectory, hence errors like:

15/11/18 17:51:39 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/spark-1.5.2-bin-hadoop2.4/bin/../lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/spark-1.5.2-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar."
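
If that is the case, the warnings are harmless, but starting the shell from the top of the installation directory instead may avoid the doubled classpath entries:

cd C:\spark-1.5.2-bin-hadoop2.4
bin\spark-shell.cmd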

And the last issue:

15/11/18 17:51:47 WARN : Your hostname, Lenovo-PC resolves to a loopback/non-reachable address: fe80:0:0:0:297a:e76d:828:59dc%wlan2, but we couldn't find any external IP address!

One workaround is to set SPARK_LOCAL_HOSTNAME to some resolvable host name and be done with it.

  • SPARK_LOCAL_HOSTNAME is a custom host name that overrides any other hostname candidates when the driver, master, workers, and executors are created.

In your case, using spark-shell, just execute the following:

SPARK_LOCAL_HOSTNAME=localhost ./bin/spark-shell
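
That syntax assumes a POSIX shell; in a plain Windows cmd console the equivalent would be:

set SPARK_LOCAL_HOSTNAME=localhost
bin\spark-shell.cmd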

You can also use:

./bin/spark-shell --conf spark.driver.host=localhost
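
If you prefer not to pass it on every start, the same property can be set in conf/spark-defaults.conf (create it from conf/spark-defaults.conf.template if needed):

spark.driver.host  localhost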

Refer also to the Environment Variables section in the official Spark documentation.
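
Once the shell comes up cleanly, a quick sanity check is to ask the SparkContext which host the driver actually bound to (sc is already defined in spark-shell):

scala> sc.getConf.get("spark.driver.host")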

Upvotes: 7
