SDG

Reputation: 2342

Cannot seem to initialize a spark context (pyspark)

I have included the entire error below, which I get when I try to run sc = SparkContext(appName="exampleName"):

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/sharan/.local/lib/python3.5/site-packages/pyspark/context.py", line 118, in __init__
    conf, jsc, profiler_cls)
  File "/home/sharan/.local/lib/python3.5/site-packages/pyspark/context.py", line 188, in _do_init
    self._javaAccumulator = self._jvm.PythonAccumulatorV2(host, port)
  File "/home/sharan/.local/lib/python3.5/site-packages/py4j/java_gateway.py", line 1525, in __call__
    answer, self._gateway_client, None, self._fqn)
  File "/home/sharan/.local/lib/python3.5/site-packages/py4j/protocol.py", line 332, in get_return_value
    format(target_id, ".", name, value))
py4j.protocol.Py4JError: An error occurred while calling None.org.apache.spark.api.python.PythonAccumulatorV2. Trace:
py4j.Py4JException: Constructor org.apache.spark.api.python.PythonAccumulatorV2([class java.lang.String, class java.lang.Integer]) does not exist
    at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:179)
    at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:196)
    at py4j.Gateway.invoke(Gateway.java:237)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)

I have no idea how to debug this. Are there any logs that I can access? Am I missing a specific package that I should have installed on my Ubuntu machine?

Upvotes: 0

Views: 1710

Answers (1)

Shivam Agrawal

Reputation: 491

This happens when the pyspark version is different from the Spark version. If you have installed Spark 2.4.7, then use pyspark 2.4.7 as well.

To get the Spark version, check it in the Spark UI or use any one of the following commands:

spark-submit --version
spark-shell --version
spark-sql --version
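To compare, you can also print the version of the pyspark package that your Python interpreter is actually importing. A minimal check, assuming pyspark is importable from the same Python environment you start your script from:

# Print the installed pyspark package version; it should match the
# Spark version reported by the commands above.
import pyspark
print(pyspark.__version__)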

To install a specific version of pyspark, use the following command:

pip install pyspark==2.4.7
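After reinstalling, a quick sanity check is to recreate the context and print the Spark version it reports; if the constructor error is gone and the versions line up, the mismatch was the cause. A minimal sketch, assuming a local Spark installation and the same appName used in the question:

# Recreate the context that previously failed and confirm the
# JVM-side Spark version matches the installed pyspark version.
from pyspark import SparkContext

sc = SparkContext(appName="exampleName")
print(sc.version)  # Spark version seen by the JVM
sc.stop()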

Upvotes: 1
