Mujtaba Faizi

Reputation: 287

Running tests with Spark 3.5.1 on Java 17 fails with IllegalAccessError: class StorageUtils cannot access class sun.nio.ch.DirectBuffer

I am using Scala 2.12.12, sbt 1.6.0, and Spark 3.5.1 with Java 17 to run Spark jobs. My tests appear to be failing because of access to sun.nio.ch. From what I have researched, this seems to be a Java 17 issue. How can I resolve it?

[info]   java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x7b8a43da) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x7b8a43da
[info]   at org.apache.spark.storage.StorageUtils$.<init>(StorageUtils.scala:213)
[info]   at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala)
[info]   at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:121)
[info]   at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:358)
[info]   at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:295)
[info]   at org.apache.spark.SparkEnv$.create(SparkEnv.scala:344)
[info]   at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:196)
[info]   at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:284)
[info]   at org.apache.spark.SparkContext.<init>(SparkContext.scala:483)
[info]   at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2888)

I tried resolving it by adding javaOptions in build.sbt, but the error still occurred:

javaOptions ++= Seq(
  "--illegal-access=permit",
  "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
  "--add-opens=java.base/java.nio=ALL-UNNAMED",
  "--add-opens=java.base/java.lang=ALL-UNNAMED",
  "--add-opens=java.base/java.util=ALL-UNNAMED"
)

Test / javaOptions ++= Seq(
  "--illegal-access=permit",
  "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
  "--add-opens=java.base/java.nio=ALL-UNNAMED",
  "--add-opens=java.base/java.lang=ALL-UNNAMED",
  "--add-opens=java.base/java.util=ALL-UNNAMED"
)
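
One thing I am not sure about: as far as I understand, sbt only passes javaOptions to the test JVM when the tests are forked, so with the default in-process runner these flags may never reach the JVM at all (and, as far as I can tell, --illegal-access=permit is a no-op on JDK 17 anyway). A minimal sketch of the extra setting that would be needed, assuming the flags themselves are right:

Test / fork := true  // without this, sbt runs tests in its own JVM and ignores Test / javaOptions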

build.sbt

name := "xyz"
ThisBuild / organization := "de.abc"

lazy val scalaV12 = "2.12.12"

lazy val dependencies =
  new {
    val sparkVersion = "3.5.1"
    val sparkBigQueryConnectorVersion = "0.36.4"
    val sparkAvroVersion = "3.5.1"
    val abcCommonsSparkVersion = "1.26"

    val typesafeConfigVersion = "1.3.4"
    val scoptVersion = "4.0.0-RC2"

    val scalaLoggingVersion = "3.9.2"
    val log4jOverSlf4jVersion = "1.7.30"

    val scalatestVersion = "3.0.8"
    val pegdownVersion = "1.6.0"

    val sparkTestsVersion = "1.0.0"

    val abcCommonsSpark = "de.abc.data" %% "commons-spark" % abcCommonsSparkVersion

    val sparkCore = "org.apache.spark" %% "spark-core" % sparkVersion % "provided"
    val sparkSql = "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"
    val sparkHive = "org.apache.spark" %% "spark-hive" % sparkVersion % "provided"
    val sparkBigQueryConnector = "com.google.cloud.spark" %% "spark-bigquery-with-dependencies" % sparkBigQueryConnectorVersion % "provided"
    val sparkAvro = "org.apache.spark" %% "spark-avro" % sparkAvroVersion

    val scalatest = "org.scalatest" %% "scalatest" % scalatestVersion
    val scopt = "com.github.scopt" %% "scopt" % scoptVersion
    val typeSafeConfig = "com.typesafe" % "config" % typesafeConfigVersion
    val scalaLogging = "com.typesafe.scala-logging" %% "scala-logging" % scalaLoggingVersion

    val pegdown = "org.pegdown" % "pegdown" % pegdownVersion % "test" // used to generate html report in scalatest
    val sparkTests = "com.github.mrpowers" %% "spark-fast-tests" % sparkTestsVersion  % "test"
  }

Test / fullClasspath := {
  val cp = (Test / fullClasspath).value
  val providedDependencies = update.map(f => f.select(configurationFilter("provided"))).value

  cp filter { f =>
    !providedDependencies.contains(f.data)
  }
}

lazy val global = project
  .in(file("."))
  .settings(
    commonSettings
  )
  .aggregate(
    commonsJobs
  )

lazy val commonsJobs = (project in file("commons-jobs"))
  .withId("commons-jobs")
  .configs(IntegrationTest)
  .disablePlugins(AssemblyPlugin, RevolverPlugin)
  .settings(
    scalaVersion := scalaV12,
    commonSettings,
    name := "commons-jobs",
    libraryDependencies ++= commonJobsDependencies
  )

lazy val commonJobsDependencies = Seq(
  dependencies.sparkCore,
  dependencies.sparkSql,
  dependencies.sparkHive,
  dependencies.sparkBigQueryConnector,
  dependencies.sparkAvro,
  dependencies.typeSafeConfig,
  dependencies.scopt,
  dependencies.scalatest,
  dependencies.sparkTests,
  dependencies.scalaLogging,
  dependencies.abcCommonsSpark
)

lazy val compilerOptions = Seq(
  "-deprecation",
  "-encoding", "UTF-8", // yes, this is 2 args
  "-feature",
  "-language:existentials",
  "-language:higherKinds",
  "-language:implicitConversions",
  "-language:reflectiveCalls",
  "-language:postfixOps",
  "-unchecked",
  "-Xfatal-warnings",
  "-Xlint",
  "-Yno-adapted-args",
  "-Ywarn-unused-import",
  "-Ywarn-dead-code",
  "-Ywarn-numeric-widen",
  "-Ywarn-value-discard",
  "-Xlint:missing-interpolator",
  "-Xfuture"
)

lazy val commonSettings = Seq(
  Compile / scalacOptions ++= compilerOptions,
  resolvers ++= Seq(
    "abc" at "https://nexus.es.ecg.tools/repository/abc-releases/",
    Resolver.bintrayRepo("cakesolutions", "maven")
  )
)

lazy val assemblySettings = Seq(
  assembly / assemblyJarName := name.value + ".jar",
  assembly / assemblyMergeStrategy := {
    case "META-INF/io.netty.versions.properties" => MergeStrategy.last
    case "module-info.class" => MergeStrategy.discard
    case PathList("META-INF", "versions", "9", "module-info.class") => MergeStrategy.discard
    case PathList("org", "apache", "spark", "unused", "UnusedStubClass.class") => MergeStrategy.last
    case x =>
      val oldStrategy = (assembly / assemblyMergeStrategy).value
      oldStrategy(x)
  }

)

I saw a similar situation in these two posts, but I could not apply the suggested changes directly to my build.sbt:

Apache Spark 3.3.0 breaks on Java 17 with "cannot access class sun.nio.ch.DirectBuffer"

Running unit tests with Spark 3.3.0 on Java 17 fails with IllegalAccessError: class StorageUtils cannot access class sun.nio.ch.DirectBuffer

Also, if I add --add-exports=java.base/sun.nio.ch=ALL-UNNAMED to the VM options in IntelliJ, I get a different error:

java.io.InvalidObjectException: ReflectiveOperationException during deserialization

from this line of code:

spark.read.option("multiLine", value = true).option("mode", "PERMISSIVE").json(path)

I build the code with GitHub Actions, so I need a solution that is not limited to IntelliJ.

Upvotes: 1

Views: 56

Answers (2)

Mujtaba Faizi

Reputation: 287

OK, I was able to resolve it on Java 17 by changing a few configurations.

I switched to Scala 2.12.15 and updated sparkTestsVersion to 1.1.0 (this resolved the ReflectiveOperationException).

As for the Java options, I didn't find a good way of setting them in build.sbt, so I just added them as a step in GitHub Actions as follows:

  - name: Set JAVA_OPTS
    if: ${{ inputs.JAVA_VERSION == '17' }}
    run: echo "JAVA_OPTS=--add-exports=java.base/sun.nio.ch=ALL-UNNAMED" >> $GITHUB_ENV
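
If you would rather not depend on a CI-only environment variable, a build.sbt-only variant should also be possible (untested sketch; it assumes the tests are forked so the options actually reach the test JVM):

Test / fork := true

Test / javaOptions ++= {
  // --add-exports is only understood by Java 9+, so skip it on Java 8.
  if (sys.props("java.specification.version").startsWith("1.")) Seq.empty
  else Seq("--add-exports=java.base/sun.nio.ch=ALL-UNNAMED")
}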

Upvotes: 0

Devyl

Reputation: 647

I ran into the same issue today. I only needed this for unit tests, but the same idea (fork + javaOptions) should apply to a regular run as well.

The trick is to fork the JVM; here is the sbt config:

Test / fork := true,
Test / javaOptions ++= Seq(
  "--add-exports", "java.base/sun.nio.ch=ALL-UNNAMED"
),
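
If other reflective-access errors show up later (Spark touches more JDK internals than just sun.nio.ch), it may be worth passing the wider set of flags that Spark's own launch scripts add on JDK 17. The list below is taken from Spark's JavaModuleOptions, so double-check it against your exact Spark version:

Test / fork := true,
Test / javaOptions ++= Seq(
  // Flags Spark's launcher uses to open JDK internals on Java 17 (verify for your version)
  "--add-opens=java.base/java.lang=ALL-UNNAMED",
  "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
  "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
  "--add-opens=java.base/java.io=ALL-UNNAMED",
  "--add-opens=java.base/java.net=ALL-UNNAMED",
  "--add-opens=java.base/java.nio=ALL-UNNAMED",
  "--add-opens=java.base/java.util=ALL-UNNAMED",
  "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
  "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
  "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
  "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED",
  "--add-opens=java.base/sun.security.action=ALL-UNNAMED",
  "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
  "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
),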

Upvotes: 0
