DennisLi

Reputation: 4154

PySpark 2.4.4 gets a JNI error with IPython on Python 3.7

I downloaded the latest Spark and changed only one thing in the config.

The change in spark-env.sh:

PYSPARK_PYTHON=/data/software/miniconda3/bin/ipython

When I run pyspark, it raises the error below. Error logs:

Python 3.7.3 (default, Mar 27 2019, 22:11:17) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.6.1 -- An enhanced Interactive Python. Type '?' for help.
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/Logger
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
    at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
    at java.lang.Class.getMethod0(Class.java:3018)
    at java.lang.Class.getMethod(Class.java:1784)
    at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.slf4j.Logger
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
[TerminalIPythonApp] WARNING | Unknown error in handling PYTHONSTARTUP file /data/software/spark-2.4.4-bin-without-hadoop/python/pyspark/shell.py:
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
/data/software/miniconda3/lib/python3.7/site-packages/IPython/core/shellapp.py in _exec_file(self, fname, shell_futures)
    338                                                  self.shell.user_ns,
    339                                                  shell_futures=shell_futures,
--> 340                                                  raise_exceptions=True)
    341         finally:
    342             sys.argv = save_argv

/data/software/miniconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py in safe_execfile(self, fname, exit_ignore, raise_exceptions, shell_futures, *where)
   2716                 py3compat.execfile(
   2717                     fname, glob, loc,
-> 2718                     self.compile if shell_futures else None)
   2719             except SystemExit as status:
   2720                 # If the call was made with 0 or None exit status (sys.exit(0)

/data/software/miniconda3/lib/python3.7/site-packages/IPython/utils/py3compat.py in execfile(fname, glob, loc, compiler)
    186     with open(fname, 'rb') as f:
    187         compiler = compiler or compile
--> 188         exec(compiler(f.read(), fname, 'exec'), glob, loc)
    189 
    190 # Refactor print statements in doctests.

/data/software/spark-2.4.4-bin-without-hadoop/python/pyspark/shell.py in <module>
     36     SparkContext.setSystemProperty("spark.executor.uri", os.environ["SPARK_EXECUTOR_URI"])
     37 
---> 38 SparkContext._ensure_initialized()
     39 
     40 try:

/data/software/spark-2.4.4-bin-without-hadoop/python/pyspark/context.py in _ensure_initialized(cls, instance, gateway, conf)
    314         with SparkContext._lock:
    315             if not SparkContext._gateway:
--> 316                 SparkContext._gateway = gateway or launch_gateway(conf)
    317                 SparkContext._jvm = SparkContext._gateway.jvm
    318 

/data/software/spark-2.4.4-bin-without-hadoop/python/pyspark/java_gateway.py in launch_gateway(conf)
     44     :return: a JVM gateway
     45     """
---> 46     return _launch_gateway(conf)
     47 
     48 

/data/software/spark-2.4.4-bin-without-hadoop/python/pyspark/java_gateway.py in _launch_gateway(conf, insecure)
    106 
    107             if not os.path.isfile(conn_info_file):
--> 108                 raise Exception("Java gateway process exited before sending its port number")
    109 
    110             with open(conn_info_file, "rb") as info:

Exception: Java gateway process exited before sending its port number

In [1]: exit                                                                                                                                                                                    
dennis@device2:/data/software/spark-2.4.4-bin-without-hadoop/conf$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
dennis@device2:/data/software/spark-2.4.4-bin-without-hadoop/conf$ export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
dennis@device2:/data/software/spark-2.4.4-bin-without-hadoop/conf$ pyspark 
[same IPython banner, JNI error, and "Java gateway process exited before sending its port number" traceback as above]

Environment:

Java:

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

The Spark build is spark-2.4.4-bin-without-hadoop

The Hadoop version is 3.0.0 (CDH 6.2.0)

Upvotes: 0

Views: 776

Answers (1)

Xulang Wan

Reputation: 26

First of all, the exception is not caused by IPython 3.7. It happens because the JVM cannot find the class org.slf4j.Logger on the classpath while the SparkContext is being initialized (during the launch of pyspark, in this case).

According to your description, you are using a "Hadoop-free" build of Spark. Spark depends on Hadoop's client libraries, so you need to explicitly tell Spark where to find Hadoop's jars, as described in Spark's documentation: https://spark.apache.org/docs/latest/hadoop-provided.html. The org.slf4j.Logger class mentioned above ships with those jars, which is why Spark failed to find it.
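
As a quick sanity check (this step is my suggestion, not something from the original question; the path is the one from your logs), you can confirm that Spark's own jars directory contains no SLF4J jar in a "without-hadoop" build:

# Should print nothing if SLF4J is indeed missing from Spark's classpath
ls /data/software/spark-2.4.4-bin-without-hadoop/jars | grep -i slf4j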

You can try two solutions:

  1. Update SPARK_DIST_CLASSPATH in spark-env.sh to explicitly tell Spark where to find the Hadoop-related jars, if Hadoop is installed on your machine (see the sketch after this list).

  2. Use a build that bundles Hadoop, such as spark-2.4.4-bin-hadoop2.7.tgz, in case you don't have Hadoop on your machine. In that build, the Hadoop-related jars are already shipped with Spark, so this problem should not occur.
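
For option 1, a minimal spark-env.sh sketch, following the linked documentation page (the /path/to placeholders are illustrative; adjust them to your installation):

# spark-env.sh
# If the 'hadoop' binary is on your PATH:
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
# Or, with an explicit path to the 'hadoop' binary:
# export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)

After adding this, relaunch pyspark; the JVM should then find org.slf4j.Logger among the Hadoop jars.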

Upvotes: 1
