
Reputation: 831

Exception: Java gateway process exited before sending its port number

I'm facing an issue when trying to use pyspark==3.1.2. I have Java 1.8 installed and added to my user PATH. According to the docs, PySpark should not need any other dependency.

My question is: do I have to install anything else, like Spark itself or something?

I'm using conda environments in VS Code.

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
k:\Deep Learning\Github\stock-pred\test_spark.ipynb Cell 2' in <cell line: 1>()
----> 1 spark = SparkSession \
      2     .builder \
      3         .appName("test-wretrwrwe") \
      4             .getOrCreate()

File ~\anaconda3\envs\prepro\lib\site-packages\pyspark\sql\session.py:228, in SparkSession.Builder.getOrCreate(self)
    226         sparkConf.set(key, value)
    227     # This SparkContext may be an existing one.
--> 228     sc = SparkContext.getOrCreate(sparkConf)
    229 # Do not update `SparkConf` for existing `SparkContext`, as it's shared
    230 # by all sessions.
    231 session = SparkSession(sc)

File ~\anaconda3\envs\prepro\lib\site-packages\pyspark\context.py:384, in SparkContext.getOrCreate(cls, conf)
    382 with SparkContext._lock:
    383     if SparkContext._active_spark_context is None:
--> 384         SparkContext(conf=conf or SparkConf())
    385     return SparkContext._active_spark_context

File ~\anaconda3\envs\prepro\lib\site-packages\pyspark\context.py:144, in SparkContext.__init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    139 if gateway is not None and gateway.gateway_parameters.auth_token is None:
    140     raise ValueError(
    141         "You are trying to pass an insecure Py4j gateway to Spark. This"
    142         " is not allowed as it is a security risk.")
--> 144 SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    145 try:
    146     self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
    147                   conf, jsc, profiler_cls)

File ~\anaconda3\envs\prepro\lib\site-packages\pyspark\context.py:331, in SparkContext._ensure_initialized(cls, instance, gateway, conf)
    329 with SparkContext._lock:
    330     if not SparkContext._gateway:
--> 331         SparkContext._gateway = gateway or launch_gateway(conf)
    332         SparkContext._jvm = SparkContext._gateway.jvm
    334     if instance:

File ~\anaconda3\envs\prepro\lib\site-packages\pyspark\java_gateway.py:108, in launch_gateway(conf, popen_kwargs)
    105     time.sleep(0.1)
    107 if not os.path.isfile(conn_info_file):
--> 108     raise Exception("Java gateway process exited before sending its port number")
    110 with open(conn_info_file, "rb") as info:
    111     gateway_port = read_int(info)

Exception: Java gateway process exited before sending its port number

Upvotes: 0

Views: 664

Answers (1)

David Wei

Reputation: 132

Using Windows as an example.

Method 1 (temporary solution):

import os
# Use a raw string so the backslashes in the Windows path are kept literally,
# and set this before the SparkSession is created.
os.environ['JAVA_HOME'] = r"C:\Program Files\Java\jdk1.8.0_331"

Method 2:

Set it system-wide instead: in the Environment Variables dialog, add a new variable named "JAVA_HOME" with the value "C:\Program Files\Java\jdk1.8.0_331".
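For anyone debugging this: the "Java gateway process exited before sending its port number" error usually means Spark's launcher could not start java at all. As a rough sketch (find_java is a hypothetical helper, not part of PySpark's API), the snippet below mimics the lookup order the launcher roughly follows, JAVA_HOME first and then the PATH, so you can check which executable, if any, PySpark would pick up:

```python
import os
import shutil

def find_java():
    """Mimic how Spark's launcher locates java: JAVA_HOME/bin first, then the PATH."""
    java_home = os.environ.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        # On Windows the executable is java.exe; check both spellings.
        for name in (candidate, candidate + ".exe"):
            if os.path.isfile(name):
                return name
    # Fall back to whatever "java" the PATH resolves to (may be None).
    return shutil.which("java")

print(find_java())
```

If this prints None, neither JAVA_HOME nor the PATH points at a java executable, which matches the gateway failure above.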

Upvotes: 0
