frm

Reputation: 687

How to fix 'Unsupported class file major version 55' while executing 'org.apache.spark.sql.Dataset.collectAsList()'

I'm creating a Java REST API with Spring Boot that uses Spark to get some data from the server. When I try to convert the Dataset to a List, it fails.

I've tried JDK 8 and JDK 11 to compile and execute the code, but I get the same 'java.lang.IllegalArgumentException: Unsupported class file major version 55'. In the past I've solved this issue by updating the Java version, but that's not working here.

I'm using:

This is the code I'm executing:

Dataset<Row> dataFrame = sparkSession.read().json("/home/data/*.json");
dataFrame.createOrReplaceTempView("events");
Dataset<Row> resultDataFrame = sparkSession.sql("SELECT * FROM events WHERE " + predicate);
Dataset<Event> eventDataSet = resultDataFrame.as(Encoders.bean(Event.class));
return eventDataSet.collectAsList();

The query works; while debugging you can actually see the data in both resultDataFrame and eventDataSet.

I expect the output to be a proper list of Events, but I'm getting the exception:

[http-nio-8080-exec-2] ERROR org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/].[dispatcherServlet] - Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is java.lang.IllegalArgumentException: Unsupported class file major version 55] with root cause
java.lang.IllegalArgumentException: Unsupported class file major version 55
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:166)
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:148)
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:136)
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:237)
    at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:49)
    at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:517)
    at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:500)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
    at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
    at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:134)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
    at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:500)
.....

UPDATE (from comments): For Java 8, I changed the pom to target Java 8:

<java.version>1.8</java.version>

Then I updated the project, ran maven clean and maven install, and ran it again. I'm still getting the same version 55 error.
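One thing worth checking, since the pom's target version does not change which JDK actually runs the application: log the runtime JVM properties from inside the app. A minimal sketch (the class name is illustrative, not part of the original project):

public class RuntimeJvmCheck {
    public static void main(String[] args) {
        // Version of the JVM actually running this code (e.g. "11.0.2" vs "1.8.0_252")
        System.out.println("java.version       = " + System.getProperty("java.version"));
        // Class file version that JVM corresponds to (52.0 = Java 8, 55.0 = Java 11)
        System.out.println("java.class.version = " + System.getProperty("java.class.version"));
    }
}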

Upvotes: 12

Views: 28598

Answers (3)

ForeverLearner

Reputation: 2113

Since most Python developers spin up a virtualenv for the project, you can use the snippet below to check the versions of the different components required for pyspark to work. The reason for the error is an incompatible Java version: pyspark expects Java 1.8+ and not JDK 11. Major version 55 corresponds to JDK 11, as you can see here.

Check only the official Spark documentation for version compatibility.

import logging
import os
import subprocess

logging.basicConfig(level=logging.INFO)

# use subprocess to find the java, scala and python versions
cmd1 = "java -version"
cmd2 = "scala -version"
cmd3 = "python --version"
cmd4 = "whoami"

arr = [cmd1, cmd2, cmd3, cmd4]

for cmd in arr:
    process = subprocess.Popen(cmd.split(" "), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    # java and scala print their versions to stderr, hence logging stdout | stderr
    logging.info(stdout.decode("utf-8") + " | " + stderr.decode("utf-8"))

logging.info(os.getenv("JAVA_HOME"))
logging.info(os.getenv("HOME"))

You will get the below output:

INFO:root: | openjdk version "1.8.0_252"
OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)
OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)

INFO:root: | Scala code runner version 2.12.2 -- Copyright 2002-2017, LAMP/EPFL and Lightbend, Inc.

INFO:root:Python 3.6.9

INFO:root:training
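As a cross-check of the mapping above (major version 52 is Java 8, 55 is Java 11), here is a hedged Java sketch, not part of the original answer, that reads the major version directly from a compiled .class file header:

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class ClassFileVersion {
    public static void main(String[] args) throws IOException {
        // A .class file starts with the magic number 0xCAFEBABE,
        // followed by a 2-byte minor version and a 2-byte major version.
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            int magic = in.readInt();
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            System.out.printf("magic=0x%08X major=%d minor=%d%n", magic, major, minor);
        }
    }
}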

Upvotes: 3

ManoshP

Reputation: 201

Excluding the default XBean artifact from the spark-core dependency and adding the latest version of the XBean artifact worked for me:

<dependencies>
    <dependency>
        <groupId>org.apache.xbean</groupId>
        <artifactId>xbean-asm6-shaded</artifactId>
        <version>4.10</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.4.1</version>
        <exclusions>
            <exclusion>
                <groupId>org.apache.xbean</groupId>
                <artifactId>xbean-asm6-shaded</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>

Upvotes: 20

frm

Reputation: 687

The root cause of the issue was a symbolic link pointing at the wrong JDK, which is why it wasn't working. JAVA_HOME was pointing at a JDK 11, and Eclipse was running with that.
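A quick way to spot this kind of mismatch (a minimal sketch, not from the original answer) is to compare the JAVA_HOME environment variable with the home of the JVM the process is actually running on:

public class JavaHomeCheck {
    public static void main(String[] args) {
        // The JDK the environment points to (possibly via a stale symlink)
        System.out.println("JAVA_HOME env var : " + System.getenv("JAVA_HOME"));
        // The JDK the current process is actually running on
        System.out.println("running java.home : " + System.getProperty("java.home"));
    }
}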

Upvotes: 5
