Garret Wilson

Reputation: 21356

Apache Spark 3.3.0 breaks on Java 17 with "cannot access class sun.nio.ch.DirectBuffer"

A similar question was asked at Running unit tests with Spark 3.3.0 on Java 17 fails with IllegalAccessError: class StorageUtils cannot access class sun.nio.ch.DirectBuffer, but that question (and solution) was only about unit tests. For me, Spark is breaking when actually running the program.

According to the Spark overview, Spark works with Java 17. I'm using Temurin-17.0.4+8 (build 17.0.4+8) on Windows 10, including Spark 3.3.0 in Maven like this:

<scala.version>2.13</scala.version>
<spark.version>3.3.0</spark.version>
...
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>

I try to run a simple program:

final SparkSession spark = SparkSession.builder().appName("Foo Bar").master("local").getOrCreate();
final Dataset<Row> df = spark.read().format("csv").option("header", "false").load("/path/to/file.csv");
df.show(5);

That breaks all over the place:

Caused by: java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x59d016c9) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x59d016c9
    at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala:213)
    at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:114)
    at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:353)
    at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:290)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:339)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:194)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:279)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:464)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
    at scala.Option.getOrElse(Option.scala:201)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)

Spark is obviously doing things one is not supposed to do in Java 17.

Disappointing. How do I get around this?

Upvotes: 72

Views: 71881

Answers (15)

Soundararajan

Reputation: 2194

For those who get this error when running tests with Maven, add the following maven-surefire-plugin configuration to your pom.xml:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <argLine>
      --add-opens=java.base/java.lang=ALL-UNNAMED
      --add-opens=java.base/java.lang.invoke=ALL-UNNAMED
      --add-opens=java.base/java.lang.reflect=ALL-UNNAMED
      --add-opens=java.base/java.io=ALL-UNNAMED
      --add-opens=java.base/java.net=ALL-UNNAMED
      --add-opens=java.base/java.nio=ALL-UNNAMED
      --add-opens=java.base/java.util=ALL-UNNAMED
      --add-opens=java.base/java.util.concurrent=ALL-UNNAMED
      --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED
      --add-opens=java.base/sun.nio.ch=ALL-UNNAMED
      --add-opens=java.base/sun.nio.cs=ALL-UNNAMED
      --add-opens=java.base/sun.security.action=ALL-UNNAMED
      --add-opens=java.base/sun.util.calendar=ALL-UNNAMED
      --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED
    </argLine>
  </configuration>
</plugin>


Upvotes: 0

dshahi

Reputation: 1

Adding this one line to my Dockerfile fixed the error:

ENV JDK_JAVA_OPTIONS="--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"
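For context, here is a minimal Dockerfile sketch (a sketch only; the base image and jar path are placeholders, not from my actual setup):

# Minimal sketch; the base image and jar path are placeholders, adjust to your build.
FROM eclipse-temurin:17-jre
# JDK_JAVA_OPTIONS is picked up automatically by the java launcher at startup (JDK 9+).
ENV JDK_JAVA_OPTIONS="--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"
COPY target/app.jar /opt/app/app.jar
ENTRYPOINT ["java", "-jar", "/opt/app/app.jar"]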

Upvotes: 0

Michael Schmidt

Reputation: 415

Maybe not the best news, but for my use case the JVM-options solution was not possible: I needed to run my Spark app on Azure HDInsight (their Spark offering).

Azure HDInsight does not (AFAIK) allow setting any JVM options, so my only solution was to build using Java 11.

In total, I am using:

Scala 2.13.1
Spark 3.5.1 (local builds)
Spark 3.3.0 (HDI 5.1) on Azure

My local SDK is Amazon Corretto 11.0.24 (corretto-11), downloaded via IntelliJ IDEA.

Upvotes: 0

Sanjanaa Suresh

Reputation: 1

Update launch.json:

  1. Navigate to the launch.json file in your Visual Studio Code project (under the .vscode folder).
  2. Locate or create the configuration you use to launch your Spark application. It might have a name like "Main" and target the com.spark.Main class.
  3. Add the following property within the configuration object:
"vmArgs": ["--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"]

example:

{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "java",
      "name": "Main",
      "request": "launch",
      "mainClass": "com.spark.Main",
      "projectName": "example",
      "vmArgs": ["--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"]
    }
  ]
}

This solved the issue when running in VS Code, but the built jar files still weren't running as expected.

Upvotes: 0

rsandor

Reputation: 31

In case anyone else has the same problem, try adding the list of --add-opens flags from chehsunliu's answer (tweaked for Groovy):

def sparkJava17CompatibleJvmArgs = [
        "--add-opens=java.base/java.lang=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
        "--add-opens=java.base/java.io=ALL-UNNAMED",
        "--add-opens=java.base/java.net=ALL-UNNAMED",
        "--add-opens=java.base/java.nio=ALL-UNNAMED",
        "--add-opens=java.base/java.util=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED",
        "--add-opens=java.base/sun.security.action=ALL-UNNAMED",
        "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
        "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
]

application {
    // Define the main class for the application.
    mainClass = 'com.cool.App'
    applicationDefaultJvmArgs = sparkJava17CompatibleJvmArgs
}

Upvotes: 3

Alastair Knowles

Reputation: 41

I saw this while trying to set up and run a job against a Spark standalone cluster in Docker.

I tried all possible combinations of passing the --add-opens directives via spark.driver.defaultJavaOptions, spark.driver.extraJavaOptions, spark.executor.defaultJavaOptions, spark.executor.extraJavaOptions.

None worked. My guess is that there isn't a mechanism to decorate the spark-submit Java entrypoint with these options.

In the end I did this in the base docker image, which solved the problem:

ENV JDK_JAVA_OPTIONS='--add-opens=java.base/sun.nio.ch=ALL-UNNAMED ...'

JVM system properties (-D...) can still be passed to the job via spark.driver.extraJavaOptions; I have no idea why the module options are not picked up at that stage.
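For example (a generic spark-submit invocation; the property name, main class, and jar name are placeholders for illustration):

spark-submit \
    --class com.example.Main \
    --conf "spark.driver.extraJavaOptions=-Dmy.example.setting=foo" \
    application.jar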

Upvotes: 2

trysofter

Reputation: 11

From https://blog.jdriven.com/2023/03/mastering-maven-setting-default-jvm-options-using-jvm-config/

Simple fix that worked for me:

  1. Add a .mvn folder to the root directory of your project
  2. Create a jvm.config file inside it and add --add-exports java.base/sun.nio.ch=ALL-UNNAMED


Upvotes: 1

Kaustubha M

Reputation: 31

I have tried this using JDK 21 and Spark 3.5.0

  <properties>
        <spark.version>3.5.0</spark.version>
        <scala.binary.version>2.12</scala.binary.version>
        <maven.compiler.source>21</maven.compiler.source>
        <maven.compiler.target>21</maven.compiler.target>
  </properties>

Add the line below as a VM option in IntelliJ IDEA:

--add-opens=java.base/sun.nio.ch=ALL-UNNAMED

And it works!

Upvotes: 3

Hongbo Miao

Reputation: 49804

These three methods work for me on a project using

  • Spark 3.3.2
  • Scala 2.13.10
  • Java 17.0.6 (my project is small; it even works on Java 19.0.1. However, if your project is big, it is better to wait until Spark officially supports it)

Method 1

export JAVA_OPTS='--add-exports java.base/sun.nio.ch=ALL-UNNAMED'
sbt run

Method 2

Create a .jvmopts file in your project folder, with content:

--add-exports java.base/sun.nio.ch=ALL-UNNAMED

Then you can run

sbt run

Method 3

If you are using IntelliJ IDEA, this is based on @Anil Reddaboina's answer (thanks!). It adds a bit more info, since the "VM options" field is not shown by default.

In the Run/Debug configuration, enable the field via "Modify options" → "Add VM options". Then you should be able to add --add-exports java.base/sun.nio.ch=ALL-UNNAMED to the "VM options" field,

or add the full set of necessary VM options:

--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED


Upvotes: 23

CH Liu

Reputation: 1884

For those using Gradle to run unit tests for Spark, apply this in build.gradle.kts:

tasks.test {
    useJUnitPlatform()
 
    val sparkJava17CompatibleJvmArgs = listOf(
        "--add-opens=java.base/java.lang=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
        "--add-opens=java.base/java.io=ALL-UNNAMED",
        "--add-opens=java.base/java.net=ALL-UNNAMED",
        "--add-opens=java.base/java.nio=ALL-UNNAMED",
        "--add-opens=java.base/java.util=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED",
        "--add-opens=java.base/sun.security.action=ALL-UNNAMED",
        "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
        "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED",
    )
    jvmArgs = sparkJava17CompatibleJvmArgs
}

Upvotes: 4

Lynne

Reputation: 547

Simply upgrading to Spark 3.3.2 solved my problem.

I use Java 17 and PySpark on the command line.

Upvotes: 2

Anil Reddaboina

Reputation: 601

The following step helped me unblock the issue.

If you are running the application from an IDE (IntelliJ IDEA), follow the instructions below.

Add the JVM option --add-exports java.base/sun.nio.ch=ALL-UNNAMED to the run configuration's "VM options" field.


source: https://arrow.apache.org/docs/java/install.html#java-compatibility

Upvotes: 57

balu mahendran

Reputation: 129

Add this as an explicit dependency in the pom.xml file. Don't change the version; keep it at 3.0.16:

<dependency>
    <groupId>org.codehaus.janino</groupId>
    <artifactId>janino</artifactId>
    <version>3.0.16</version>
</dependency>

Then add the command-line arguments. If you use VS Code, add

"vmArgs": "--add-exports java.base/sun.nio.ch=ALL-UNNAMED"

in the configurations section of the launch.json file under the .vscode folder in your project.

Upvotes: 3

dlamblin

Reputation: 45321

You could use JDK 8. Maybe you really should.

But if you can't, you might try adding these Java options to your build.sbt file. For me they were needed for tests, so I put them in:

val projectSettings = Seq(
...
  Test / javaOptions ++= Seq(
    "base/java.lang", "base/java.lang.invoke", "base/java.lang.reflect", "base/java.io", "base/java.net", "base/java.nio",
    "base/java.util", "base/java.util.concurrent", "base/java.util.concurrent.atomic",
    "base/sun.nio.ch", "base/sun.nio.cs", "base/sun.security.action",
    "base/sun.util.calendar", "security.jgss/sun.security.krb5",
  ).map("--add-opens=java." + _ + "=ALL-UNNAMED"),
...
)

Upvotes: 4

Solution


Please consider adding the appropriate Java Virtual Machine command-line options.
The exact way to add them depends on how you run the program: by using a command line, an IDE, etc.

Examples

The command-line options have been taken from the JavaModuleOptions class: spark/JavaModuleOptions.java at v3.3.0 · apache/spark.

Command line

For example, to run the program (the .jar file) by using the command line:

java \
    --add-opens=java.base/java.lang=ALL-UNNAMED \
    --add-opens=java.base/java.lang.invoke=ALL-UNNAMED \
    --add-opens=java.base/java.lang.reflect=ALL-UNNAMED \
    --add-opens=java.base/java.io=ALL-UNNAMED \
    --add-opens=java.base/java.net=ALL-UNNAMED \
    --add-opens=java.base/java.nio=ALL-UNNAMED \
    --add-opens=java.base/java.util=ALL-UNNAMED \
    --add-opens=java.base/java.util.concurrent=ALL-UNNAMED \
    --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED \
    --add-opens=java.base/sun.nio.ch=ALL-UNNAMED \
    --add-opens=java.base/sun.nio.cs=ALL-UNNAMED \
    --add-opens=java.base/sun.security.action=ALL-UNNAMED \
    --add-opens=java.base/sun.util.calendar=ALL-UNNAMED \
    --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED \
    -jar <JAR_FILE_PATH>
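
If you are unsure whether the options actually reached the JVM, a small sanity check like the following can help (my own sketch, not part of Spark; the class name is made up). Compile it and run it with the same JVM options as the Spark application:

// Hypothetical helper, for illustration only.
public final class ModuleOpenCheck {
    public static void main(final String[] args) {
        // java.base is always present in the boot module layer.
        final Module javaBase = ModuleLayer.boot().findModule("java.base").orElseThrow();
        // Classes loaded from the classpath live in the class loader's unnamed module.
        final Module self = ModuleOpenCheck.class.getModule();
        if (javaBase.isExported("sun.nio.ch", self)) {
            System.out.println("sun.nio.ch is accessible; Spark's StorageUtils should load.");
        } else {
            System.err.println("sun.nio.ch is NOT accessible; the --add-opens/--add-exports option did not reach the JVM.");
        }
    }
}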

IDE: IntelliJ IDEA

Add the same --add-opens options to the "VM options" field of the run/debug configuration; the IntelliJ IDEA answers above show where to find that field.

Upvotes: 44
