Reputation: 687
I'm creating a Java REST API Spring Boot application that uses Spark to get some data from the server. When I try to convert from Dataset to List, it fails.
I've tried JDK 8 and JDK 11 to compile and execute the code, but I get the same 'java.lang.IllegalArgumentException: Unsupported class file major version 55'. In the past I've solved this issue by updating the Java version, but it's not working here.
I'm using:
JDK 11.0.2
Spring Boot 2.1.4
Spark 2.4.2
This is the code I'm executing:
Dataset<Row> dataFrame = sparkSession.read().json("/home/data/*.json");
dataFrame.createOrReplaceTempView("events");
Dataset<Row> resultDataFrame = sparkSession.sql("SELECT * FROM events WHERE " + predicate);
Dataset<Event> eventDataSet = resultDataFrame.as(Encoders.bean(Event.class));
return eventDataSet.collectAsList();
The query works; while debugging I can see data in both resultDataFrame and eventDataSet.
I expect the output to be a proper list of Events, but I'm getting the exception:
[http-nio-8080-exec-2] ERROR org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/].[dispatcherServlet] - Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is java.lang.IllegalArgumentException: Unsupported class file major version 55] with root cause
java.lang.IllegalArgumentException: Unsupported class file major version 55
at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:166)
at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:148)
at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:136)
at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:237)
at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:49)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:517)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:500)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:134)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:500)
.....
UPDATE (from comments): To target Java 8, I changed the pom to:
<java.version>1.8</java.version>
Then I updated the project, ran a Maven clean and install, and ran the application again. I still get the same major version 55 error.
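One way to confirm which bytecode version the build is actually producing is to read the major version straight out of a compiled class file. The sketch below is only illustrative (the path to Event.class is a placeholder): 52 means Java 8, 55 means Java 11.
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class ClassVersionCheck {
    public static void main(String[] args) throws IOException {
        // Placeholder path: point this at any compiled .class file from the build
        try (DataInputStream in = new DataInputStream(
                new FileInputStream("target/classes/com/example/Event.class"))) {
            int magic = in.readInt();           // 0xCAFEBABE for a valid class file
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort(); // 52 = Java 8, 55 = Java 11
            System.out.printf("magic=%X minor=%d major=%d%n", magic, minor, major);
        }
    }
}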
Upvotes: 12
Views: 28598
Reputation: 2113
Since most Python developers spin up a virtualenv for their project, you can use the snippet below to check the versions of the different components required for PySpark to work. The reason for the error is an incompatible Java version: PySpark expects Java 1.8+, not JDK 11. Major version 55 corresponds to JDK 11, as you can see here.
Check the official Spark documentation for version compatibility.
import logging
import os
import subprocess

logging.basicConfig(level=logging.INFO)

# Use subprocess to find the java, scala and python versions
cmd1 = "java -version"
cmd2 = "scala -version"
cmd3 = "python --version"
cmd4 = "whoami"

arr = [cmd1, cmd2, cmd3, cmd4]
for cmd in arr:
    process = subprocess.Popen(cmd.split(" "), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    logging.info(stdout.decode("utf-8") + " | " + stderr.decode("utf-8"))

logging.info(os.getenv("JAVA_HOME"))
logging.info(os.getenv("HOME"))
You will get the below output:
INFO:root: | openjdk version "1.8.0_252"
OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)
OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)
INFO:root: | Scala code runner version 2.12.2 -- Copyright 2002-2017, LAMP/EPFL and Lightbend, Inc.
INFO:root:Python 3.6.9
INFO:root:training
Upvotes: 3
Reputation: 201
Excluding the default xbean-asm6-shaded artifact from the spark-core dependency and adding the latest version of it instead worked for me. The exception comes from the shaded ASM bundled in that artifact (org.apache.xbean.asm6.ClassReader in the stack trace above); the version pulled in transitively by spark-core 2.4.1 cannot read Java 11 (major version 55) class files, but the 4.10 release handled them for me.
<dependencies>
    <dependency>
        <groupId>org.apache.xbean</groupId>
        <artifactId>xbean-asm6-shaded</artifactId>
        <version>4.10</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.4.1</version>
        <exclusions>
            <exclusion>
                <groupId>org.apache.xbean</groupId>
                <artifactId>xbean-asm6-shaded</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
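As a sanity check, running mvn dependency:tree before and after adding the exclusion should confirm which xbean-asm6-shaded version actually ends up on the classpath.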
Upvotes: 20
Reputation: 687
The root cause of the issue was a symbolic link pointing at the wrong JDK, and that's why it wasn't working. JAVA_HOME was pointing at a JDK 11 installation, and Eclipse was running with that.
Upvotes: 5
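For anyone debugging the same mismatch, a minimal sketch (plain JDK APIs, class name is illustrative) that prints the JVM the application is actually running on and resolves JAVA_HOME through any symbolic links:
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class JdkCheck {
    public static void main(String[] args) throws IOException {
        // The JVM that is actually executing this code, regardless of the pom's compile target
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.home    = " + System.getProperty("java.home"));

        // Resolve JAVA_HOME through symbolic links to see which JDK it really points at
        String javaHome = System.getenv("JAVA_HOME");
        if (javaHome != null) {
            Path real = Paths.get(javaHome).toRealPath();
            System.out.println("JAVA_HOME    = " + javaHome + " -> " + real);
        }
    }
}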