Sarthak Sharma

Reputation: 21

org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus can't be cast to org.apache.spark.sql.execution.datasources.FileStatusWithMetadata

Getting the following error while creating a Delta table using Scala Spark. The _delta_log directory is created in the warehouse, but the job fails with this error right after the _delta_log creation:

Exception in thread "main" java.lang.ClassCastException: class org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus cannot be cast to class org.apache.spark.sql.execution.datasources.FileStatusWithMetadata (org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus and org.apache.spark.sql.execution.datasources.FileStatusWithMetadata are in unnamed module of loader 'app')

build.sbt

//Using JDK 20

import sbt.project

ThisBuild / version := "0.1.0-SNAPSHOT"

ThisBuild / scalaVersion := "2.12.4"

lazy val root = (project in file("."))
  .settings(
    name := "scala-spark-app2",
    idePackagePrefix := Some("com.scalaspark.app2"),
  )

libraryDependencies += "org.apache.spark" %% "spark-core" % "3.5.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.5.0"
libraryDependencies += "io.delta" %% "delta-core" % "2.4.0"

Main.scala

package com.scalaspark.app2

import org.apache.spark.sql.{Row, SparkSession}

object Main {
  def main(args: Array[String]): Unit = {
    val warehouseDir = "/Users/usr1/app2/scala-spark-app2/data_files/"
    val spark = SparkSession.builder()
      .appName("DeltaTableDetails")
      .config("spark.master", "local")
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
      .config("spark.sql.warehouse.dir", warehouseDir)
      .getOrCreate()

    var query = "CREATE TABLE IF NOT EXISTS T1 (id bigint, data string) USING delta LOCATION '/Users/usr1/scala-spark-app2/data_files/T1'"
    spark.sql(query)
    query = "INSERT INTO T1 VALUES(10,'VAL1')"
    spark.sql(query)
  }
}

Tried altering the versions for Scala, Spark and Delta; nothing worked. Using JDK 20.

Upvotes: 2

Views: 837

Answers (1)

Stefanie Brenner

Reputation: 21

The highest Java version Spark currently supports is Java 17. Make sure you use that version.
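
If you are unsure which JVM your application is actually launched with, a quick check from inside the app is (a minimal sketch, not Spark-specific):

object JavaVersionCheck {
  def main(args: Array[String]): Unit = {
    // Prints the Java version the JVM is actually running on.
    // Spark 3.5.x is built for Java 8, 11 and 17; newer JDKs such as 20 are not supported.
    val javaVersion = System.getProperty("java.version")
    println(s"Running on Java $javaVersion")
  }
}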

I personally used Spark 3.5.1 and it only worked with delta-spark 3.1.0; with delta-core 2.4.0 I got an error similar to yours.

Info: as of version 3.0.0, delta-core was renamed to delta-spark.
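
In build.sbt that is just a change of the artifact name, roughly (a sketch assuming Delta 3.x paired with Spark 3.5.x):

// before (Delta 2.x, does not work with Spark 3.5.x)
libraryDependencies += "io.delta" %% "delta-core" % "2.4.0"
// after (Delta 3.x)
libraryDependencies += "io.delta" %% "delta-spark" % "3.1.0"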

Here are the versions I used, which work together (a matching build.sbt sketch follows the list):

  • Spark==3.5.1
  • Hadoop==3.3.4
  • Java==17.0.10
  • Scala==2.12.18
  • delta-spark==3.1.0 (for Scala 2.12)
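
Put together with the build file from the question, a minimal sketch matching the versions above could look like the following (the JDK itself is chosen outside of build.sbt, e.g. via JAVA_HOME pointing at a Java 17 installation):

ThisBuild / scalaVersion := "2.12.18"

libraryDependencies += "org.apache.spark" %% "spark-core"  % "3.5.1"
libraryDependencies += "org.apache.spark" %% "spark-sql"   % "3.5.1"
libraryDependencies += "io.delta"         %% "delta-spark" % "3.1.0"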

I hope that helps.

Upvotes: 2
