Reputation: 21356
A similar question was asked at Running unit tests with Spark 3.3.0 on Java 17 fails with IllegalAccessError: class StorageUtils cannot access class sun.nio.ch.DirectBuffer, but that question (and solution) was only about unit tests. For me Spark is breaking actually running the program.
According to the Spark overview, Spark works with Java 17. I'm using Temurin-17.0.4+8 (build 17.0.4+8) on Windows 10, including Spark 3.3.0 in Maven like this:
<scala.version>2.13</scala.version>
<spark.version>3.3.0</spark.version>
...
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>
I try to run a simple program:
final SparkSession spark = SparkSession.builder().appName("Foo Bar").master("local").getOrCreate();
final Dataset<Row> df = spark.read().format("csv").option("header", "false").load("/path/to/file.csv");
df.show(5);
That breaks all over the place:
Caused by: java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x59d016c9) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x59d016c9
at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala:213)
at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:114)
at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:353)
at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:290)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:339)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:194)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:279)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:464)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
at scala.Option.getOrElse(Option.scala:201)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
Spark is obviously doing things one is not supposed to do in Java 17.
Disappointing. How do I get around this?
Upvotes: 72
Views: 71881
Reputation: 2194
For those who get this error when running tests with Maven, add the snippet below to the maven-surefire-plugin configuration in your pom.xml:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <argLine>
      --add-opens=java.base/java.lang=ALL-UNNAMED
      --add-opens=java.base/java.lang.invoke=ALL-UNNAMED
      --add-opens=java.base/java.lang.reflect=ALL-UNNAMED
      --add-opens=java.base/java.io=ALL-UNNAMED
      --add-opens=java.base/java.net=ALL-UNNAMED
      --add-opens=java.base/java.nio=ALL-UNNAMED
      --add-opens=java.base/java.util=ALL-UNNAMED
      --add-opens=java.base/java.util.concurrent=ALL-UNNAMED
      --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED
      --add-opens=java.base/sun.nio.ch=ALL-UNNAMED
      --add-opens=java.base/sun.nio.cs=ALL-UNNAMED
      --add-opens=java.base/sun.security.action=ALL-UNNAMED
      --add-opens=java.base/sun.util.calendar=ALL-UNNAMED
      --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED
    </argLine>
  </configuration>
</plugin>
Upvotes: 0
Reputation: 1
Just adding this one line in the Dockerfile fixed the error:
ENV JDK_JAVA_OPTIONS="--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"
Upvotes: 0
Reputation: 415
Maybe not the best news, but for my use case the JVM options solution was not possible: I needed to run my Spark app on Azure HDInsight (their implementation of Spark).
Azure HDInsight does not (AFAIK) allow setting any JVM options, so my only solution was to build using Java 11.
In total, I am using:
Scala 2.13.1
Spark 3.5.1 local builds
Spark 3.3.0 (HDI 5.1) Azure
My local SDK is Amazon Corretto 11.0.24 (corretto-11), downloaded by IntelliJ.
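For a Maven build like the one in the question, pinning the build to Java 11 might look like the sketch below (the original answer does not say which build tool was used; maven.compiler.release assumes maven-compiler-plugin 3.6+):
<properties>
  <!-- Compile for Java 11 so the resulting jar runs on HDInsight without extra JVM flags -->
  <maven.compiler.release>11</maven.compiler.release>
</properties>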
Upvotes: 0
Reputation: 1
Update the launch.json file in your Visual Studio Code project: add
"vmArgs": ["--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"]
to the launch configuration for your main class (here com.spark.Main).
Example:
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "java",
      "name": "Main",
      "request": "launch",
      "mainClass": "com.spark.Main",
      "projectName": "example",
      "vmArgs": ["--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"]
    }
  ]
}
This solved the issue when running in the context of VS Code, but the packaged jar files still weren't running as expected.
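To run the packaged jar outside VS Code, the same flag would have to be passed to java directly, possibly along with the other --add-opens options listed in other answers (the jar name below is hypothetical):
java --add-opens=java.base/sun.nio.ch=ALL-UNNAMED -jar your-app.jar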
Upvotes: 0
Reputation: 31
In case anyone else has the same problem, try adding the list of --add-opens options from chehsunliu's answer (tweaked here for Groovy):
def sparkJava17CompatibleJvmArgs = [
    "--add-opens=java.base/java.lang=ALL-UNNAMED",
    "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
    "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
    "--add-opens=java.base/java.io=ALL-UNNAMED",
    "--add-opens=java.base/java.net=ALL-UNNAMED",
    "--add-opens=java.base/java.nio=ALL-UNNAMED",
    "--add-opens=java.base/java.util=ALL-UNNAMED",
    "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
    "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
    "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
    "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED",
    "--add-opens=java.base/sun.security.action=ALL-UNNAMED",
    "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
    "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
]

application {
    // Define the main class for the application.
    mainClass = 'com.cool.App'
    applicationDefaultJvmArgs = sparkJava17CompatibleJvmArgs
}
Upvotes: 3
Reputation: 41
I saw this while trying to set up and run a job against a Spark standalone cluster in Docker.
I tried all possible combinations of passing the --add-opens directives via spark.driver.defaultJavaOptions, spark.driver.extraJavaOptions, spark.executor.defaultJavaOptions, and spark.executor.extraJavaOptions.
None worked. My guess is that there isn't a mechanism to decorate the spark-submit Java entrypoint with these options.
In the end I did this in the base docker image, which solved the problem:
ENV JDK_JAVA_OPTIONS='--add-opens=java.base/sun.nio.ch=ALL-UNNAMED ...'
JVM (-D...) properties can be passed to the job via spark.driver.extraJavaOptions. No idea why the module options are not picked up at that phase.
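As an illustration of that last point, passing a -D property through spark.driver.extraJavaOptions looks roughly like this (the property name, main class, and jar are hypothetical, not from the original answer):
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dmy.app.flag=true" \
  --class com.example.Main \
  app.jar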
Upvotes: 2
Reputation: 11
From https://blog.jdriven.com/2023/03/mastering-maven-setting-default-jvm-options-using-jvm-config/
Simple fix that worked for me:
--add-exports java.base/sun.nio.ch=ALL-UNNAMED
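Assuming you follow the linked post (Maven 3.3.1+ reads a .mvn/jvm.config file in the project root), the entire content of that file would be this single line:
--add-exports java.base/sun.nio.ch=ALL-UNNAMED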
Upvotes: 1
Reputation: 31
I have tried this using JDK 21 and Spark 3.5.0:
<properties>
  <spark.version>3.5.0</spark.version>
  <scala.binary.version>2.12</scala.binary.version>
  <maven.compiler.source>21</maven.compiler.source>
  <maven.compiler.target>21</maven.compiler.target>
</properties>
Add the line below as a VM option in IntelliJ IDEA:
--add-opens=java.base/sun.nio.ch=ALL-UNNAMED
And it works!
Upvotes: 3
Reputation: 49804
These three methods work for me on a project using sbt:
1. Set JAVA_OPTS in your shell and run sbt:
export JAVA_OPTS='--add-exports java.base/sun.nio.ch=ALL-UNNAMED'
sbt run
2. Create a .jvmopts file in your project folder, with content:
--add-exports java.base/sun.nio.ch=ALL-UNNAMED
Then you can run:
sbt run
3. If you are using IntelliJ IDEA (this is based on @Anil Reddaboina's answer, and thanks! I'm adding more info because I don't have that "VM options" field by default): add the "VM options" field to your run configuration first. Then you should be able to add
--add-exports java.base/sun.nio.ch=ALL-UNNAMED
to the "VM options" field, or add the full set of necessary VM options arguments:
--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED
Upvotes: 23
Reputation: 1884
For those using Gradle to run unit tests for Spark, apply this in build.gradle.kts:
tasks.test {
    useJUnitPlatform()
    val sparkJava17CompatibleJvmArgs = listOf(
        "--add-opens=java.base/java.lang=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
        "--add-opens=java.base/java.io=ALL-UNNAMED",
        "--add-opens=java.base/java.net=ALL-UNNAMED",
        "--add-opens=java.base/java.nio=ALL-UNNAMED",
        "--add-opens=java.base/java.util=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED",
        "--add-opens=java.base/sun.security.action=ALL-UNNAMED",
        "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
        "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED",
    )
    jvmArgs = sparkJava17CompatibleJvmArgs
}
Upvotes: 4
Reputation: 547
Simply upgrading to Spark 3.3.2 solved my problem.
I use Java 17 and PySpark on the command line.
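If PySpark was installed with pip, the upgrade is simply (assuming a pip-managed environment rather than a downloaded Spark distribution):
pip install pyspark==3.3.2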
Upvotes: 2
Reputation: 601
The following step helped me unblock the issue.
If you are running the application from an IDE (IntelliJ IDEA), follow the instructions below:
Add the JVM option "--add-exports java.base/sun.nio.ch=ALL-UNNAMED"
source: https://arrow.apache.org/docs/java/install.html#java-compatibility
Upvotes: 57
Reputation: 129
Add this as an explicit dependency in the pom.xml file. Don't use any version other than 3.0.16:
<dependency>
  <groupId>org.codehaus.janino</groupId>
  <artifactId>janino</artifactId>
  <version>3.0.16</version>
</dependency>
and then add the command-line arguments. If you use VS Code, add
"vmArgs": "--add-exports java.base/sun.nio.ch=ALL-UNNAMED"
in the configurations section of the launch.json file under the .vscode folder in your project.
Upvotes: 3
Reputation: 45321
You could use JDK 8. Maybe you really should.
But if you can't, you might try adding these Java options to your build.sbt file. For me they were needed for tests, so I put them into:
val projectSettings = Seq(
  ...
  Test / javaOptions ++= Seq(
    "base/java.lang", "base/java.lang.invoke", "base/java.lang.reflect", "base/java.io", "base/java.net", "base/java.nio",
    "base/java.util", "base/java.util.concurrent", "base/java.util.concurrent.atomic",
    "base/sun.nio.ch", "base/sun.nio.cs", "base/sun.security.action",
    "base/sun.util.calendar", "security.jgss/sun.security.krb5",
  ).map("--add-opens=java." + _ + "=ALL-UNNAMED"),
  ...
)
Upvotes: 4
Reputation: 18106
Please consider adding the appropriate Java Virtual Machine command-line options.
The exact way to add them depends on how you run the program: from the command line, from an IDE, etc.
The command-line options have been taken from the JavaModuleOptions class: spark/JavaModuleOptions.java at v3.3.0 · apache/spark.
For example, to run the program (the .jar file) from the command line:
java \
--add-opens=java.base/java.lang=ALL-UNNAMED \
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED \
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED \
--add-opens=java.base/java.io=ALL-UNNAMED \
--add-opens=java.base/java.net=ALL-UNNAMED \
--add-opens=java.base/java.nio=ALL-UNNAMED \
--add-opens=java.base/java.util=ALL-UNNAMED \
--add-opens=java.base/java.util.concurrent=ALL-UNNAMED \
--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED \
--add-opens=java.base/sun.nio.ch=ALL-UNNAMED \
--add-opens=java.base/sun.nio.cs=ALL-UNNAMED \
--add-opens=java.base/sun.security.action=ALL-UNNAMED \
--add-opens=java.base/sun.util.calendar=ALL-UNNAMED \
--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED \
-jar <JAR_FILE_PATH>
Upvotes: 44