Hongbo Miao

Reputation: 49974

Spark app in Scala 2.13 failed with: Exception in thread "main" java.lang.NoSuchMethodError

I have a very simple Spark app in Scala:

src/main/scala/FindRetiredPeople.scala

import org.apache.spark.sql.{DataFrame, SparkSession}

object FindRetiredPeople {
  def main(args: Array[String]): Unit = {
    val people = Seq(
      (1, "Alice", 25),
      (2, "Bob", 30),
      (3, "Charlie", 80),
      (4, "Dave", 40),
      (5, "Eve", 45)
    )

    val spark: SparkSession = SparkSession.builder()
      .master("local[*]")
      .appName("find-retired-people")
      .config("spark.ui.port", "4040")
      .getOrCreate()

    import spark.implicits._
    val df: DataFrame = people.toDF("id", "name", "age")
    df.createOrReplaceTempView("people")

    val retiredPeople: DataFrame = spark.sql("SELECT name, age FROM people WHERE age >= 67")
    retiredPeople.show()

    spark.stop()
  }
}

Scala 2.12 (succeeds with both sbt run and spark-submit)

When I use Scala 2.12:

build.sbt

name := "FindRetiredPeople"
version := "1.0"
scalaVersion := "2.12.17"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.3.2",
  "org.apache.spark" %% "spark-sql" % "3.3.2",
  "org.apache.spark" %% "spark-streaming" % "3.3.2",
)

I can successfully run

sbt package
sbt run

and also

spark-submit \
    --class=FindRetiredPeople \
    --master="local[*]" \
    target/scala-2.12/findretiredpeople_2.12-1.0.jar

spark-submit prints this log:

23/03/21 01:59:23 INFO SparkContext: Running Spark version 3.3.2
23/03/21 01:59:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/03/21 01:59:23 INFO ResourceUtils: ==============================================================
23/03/21 01:59:23 INFO ResourceUtils: No custom resources configured for spark.driver.
23/03/21 01:59:23 INFO ResourceUtils: ==============================================================
23/03/21 01:59:23 INFO SparkContext: Submitted application: find-retired-people
23/03/21 01:59:23 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
23/03/21 01:59:23 INFO ResourceProfile: Limiting resource is cpu
23/03/21 01:59:23 INFO ResourceProfileManager: Added ResourceProfile id: 0
23/03/21 01:59:23 INFO SecurityManager: Changing view acls to: hongbo-miao
23/03/21 01:59:23 INFO SecurityManager: Changing modify acls to: hongbo-miao
23/03/21 01:59:23 INFO SecurityManager: Changing view acls groups to: 
23/03/21 01:59:23 INFO SecurityManager: Changing modify acls groups to: 
23/03/21 01:59:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(hongbo-miao); groups with view permissions: Set(); users  with modify permissions: Set(hongbo-miao); groups with modify permissions: Set()
23/03/21 01:59:23 INFO Utils: Successfully started service 'sparkDriver' on port 57085.
23/03/21 01:59:23 INFO SparkEnv: Registering MapOutputTracker
23/03/21 01:59:23 INFO SparkEnv: Registering BlockManagerMaster
23/03/21 01:59:23 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/03/21 01:59:23 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/03/21 01:59:23 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
23/03/21 01:59:23 INFO DiskBlockManager: Created local directory at /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/blockmgr-719fedfe-99ac-4399-aaf2-c561f236c487
23/03/21 01:59:23 INFO MemoryStore: MemoryStore started with capacity 434.4 MiB
23/03/21 01:59:23 INFO SparkEnv: Registering OutputCommitCoordinator
23/03/21 01:59:23 INFO Utils: Successfully started service 'SparkUI' on port 4040.
23/03/21 01:59:23 INFO SparkContext: Added JAR file:/Users/hongbo-miao/find-retired-people-scala/target/scala-2.12/findretiredpeople_2.12-1.0.jar at spark://10.37.129.2:57085/jars/findretiredpeople_2.12-1.0.jar with timestamp 1679389163209
23/03/21 01:59:23 INFO Executor: Starting executor ID driver on host 10.37.129.2
23/03/21 01:59:23 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
23/03/21 01:59:23 INFO Executor: Fetching spark://10.37.129.2:57085/jars/findretiredpeople_2.12-1.0.jar with timestamp 1679389163209
23/03/21 01:59:23 INFO TransportClientFactory: Successfully created connection to /10.37.129.2:57085 after 14 ms (0 ms spent in bootstraps)
23/03/21 01:59:23 INFO Utils: Fetching spark://10.37.129.2:57085/jars/findretiredpeople_2.12-1.0.jar to /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-5a9b2fb3-361d-4e3a-9acf-d6292fa0ead3/userFiles-5001866c-99f0-47e2-a6a6-d84091d20a8f/fetchFileTemp16516278909384835567.tmp
23/03/21 01:59:23 INFO Executor: Adding file:/private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-5a9b2fb3-361d-4e3a-9acf-d6292fa0ead3/userFiles-5001866c-99f0-47e2-a6a6-d84091d20a8f/findretiredpeople_2.12-1.0.jar to class loader
23/03/21 01:59:23 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57087.
23/03/21 01:59:23 INFO NettyBlockTransferService: Server created on 10.37.129.2:57087
23/03/21 01:59:23 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
23/03/21 01:59:23 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.37.129.2, 57087, None)
23/03/21 01:59:23 INFO BlockManagerMasterEndpoint: Registering block manager 10.37.129.2:57087 with 434.4 MiB RAM, BlockManagerId(driver, 10.37.129.2, 57087, None)
23/03/21 01:59:23 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.37.129.2, 57087, None)
23/03/21 01:59:23 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.37.129.2, 57087, None)
23/03/21 01:59:24 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir.
23/03/21 01:59:24 INFO SharedState: Warehouse path is 'file:/Users/hongbo-miao/find-retired-people-scala/spark-warehouse'.
23/03/21 01:59:25 INFO CodeGenerator: Code generated in 77.321 ms
23/03/21 01:59:25 INFO CodeGenerator: Code generated in 3.580958 ms
23/03/21 01:59:25 INFO CodeGenerator: Code generated in 7.1175 ms
23/03/21 01:59:26 INFO CodeGenerator: Code generated in 6.165709 ms
+-------+---+
|   name|age|
+-------+---+
|Charlie| 80|
+-------+---+

23/03/21 01:59:26 INFO SparkUI: Stopped Spark web UI at http://10.37.129.2:4040
23/03/21 01:59:26 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/03/21 01:59:26 INFO MemoryStore: MemoryStore cleared
23/03/21 01:59:26 INFO BlockManager: BlockManager stopped
23/03/21 01:59:26 INFO BlockManagerMaster: BlockManagerMaster stopped
23/03/21 01:59:26 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/03/21 01:59:26 INFO SparkContext: Successfully stopped SparkContext
23/03/21 01:59:26 INFO ShutdownHookManager: Shutdown hook called
23/03/21 01:59:26 INFO ShutdownHookManager: Deleting directory /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-3c8e7833-ca25-4178-ad2e-d32b8842ff0a
23/03/21 01:59:26 INFO ShutdownHookManager: Deleting directory /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-5a9b2fb3-361d-4e3a-9acf-d6292fa0ead3

Process finished with exit code 0

Scala 2.13 (succeeds with sbt run, but fails with spark-submit)

However, if I use Scala 2.13:

build.sbt

name := "FindRetiredPeople"
version := "1.0"
scalaVersion := "2.13.10"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.3.2",
  "org.apache.spark" %% "spark-sql" % "3.3.2",
  "org.apache.spark" %% "spark-streaming" % "3.3.2",
)

I can still successfully run

sbt package
sbt run

but this

spark-submit \
    --class=FindRetiredPeople \
    --master="local[*]" \
    target/scala-2.13/findretiredpeople_2.13-1.0.jar

throws an exception:

23/03/21 01:39:31 INFO SparkContext: Running Spark version 3.3.2
23/03/21 01:39:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/03/21 01:39:31 INFO ResourceUtils: ==============================================================
23/03/21 01:39:31 INFO ResourceUtils: No custom resources configured for spark.driver.
23/03/21 01:39:31 INFO ResourceUtils: ==============================================================
23/03/21 01:39:31 INFO SparkContext: Submitted application: find-retired-people
23/03/21 01:39:31 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
23/03/21 01:39:31 INFO ResourceProfile: Limiting resource is cpu
23/03/21 01:39:31 INFO ResourceProfileManager: Added ResourceProfile id: 0
23/03/21 01:39:31 INFO SecurityManager: Changing view acls to: hongbo-miao
23/03/21 01:39:31 INFO SecurityManager: Changing modify acls to: hongbo-miao
23/03/21 01:39:31 INFO SecurityManager: Changing view acls groups to: 
23/03/21 01:39:31 INFO SecurityManager: Changing modify acls groups to: 
23/03/21 01:39:31 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(hongbo-miao); groups with view permissions: Set(); users  with modify permissions: Set(hongbo-miao); groups with modify permissions: Set()
23/03/21 01:39:31 INFO Utils: Successfully started service 'sparkDriver' on port 56517.
23/03/21 01:39:31 INFO SparkEnv: Registering MapOutputTracker
23/03/21 01:39:31 INFO SparkEnv: Registering BlockManagerMaster
23/03/21 01:39:31 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/03/21 01:39:31 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/03/21 01:39:31 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
23/03/21 01:39:31 INFO DiskBlockManager: Created local directory at /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/blockmgr-7a60f8cd-54ff-464c-affc-8d9e90cb233f
23/03/21 01:39:31 INFO MemoryStore: MemoryStore started with capacity 434.4 MiB
23/03/21 01:39:31 INFO SparkEnv: Registering OutputCommitCoordinator
23/03/21 01:39:31 INFO Utils: Successfully started service 'SparkUI' on port 4040.
23/03/21 01:39:31 INFO SparkContext: Added JAR file:/Users/hongbo-miao/find-retired-people-scala/target/scala-2.13/findretiredpeople_2.13-1.0.jar at spark://10.37.129.2:56517/jars/findretiredpeople_2.13-1.0.jar with timestamp 1679387971368
23/03/21 01:39:32 INFO Executor: Starting executor ID driver on host 10.37.129.2
23/03/21 01:39:32 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
23/03/21 01:39:32 INFO Executor: Fetching spark://10.37.129.2:56517/jars/findretiredpeople_2.13-1.0.jar with timestamp 1679387971368
23/03/21 01:39:32 INFO TransportClientFactory: Successfully created connection to /10.37.129.2:56517 after 12 ms (0 ms spent in bootstraps)
23/03/21 01:39:32 INFO Utils: Fetching spark://10.37.129.2:56517/jars/findretiredpeople_2.13-1.0.jar to /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-4ded8a6d-f9f3-4483-a62a-78edee62932d/userFiles-d17d2628-db18-484a-9718-8a44d7292c3b/fetchFileTemp5844915920242526627.tmp
23/03/21 01:39:32 INFO Executor: Adding file:/private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-4ded8a6d-f9f3-4483-a62a-78edee62932d/userFiles-d17d2628-db18-484a-9718-8a44d7292c3b/findretiredpeople_2.13-1.0.jar to class loader
23/03/21 01:39:32 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 56519.
23/03/21 01:39:32 INFO NettyBlockTransferService: Server created on 10.37.129.2:56519
23/03/21 01:39:32 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
23/03/21 01:39:32 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.37.129.2, 56519, None)
23/03/21 01:39:32 INFO BlockManagerMasterEndpoint: Registering block manager 10.37.129.2:56519 with 434.4 MiB RAM, BlockManagerId(driver, 10.37.129.2, 56519, None)
23/03/21 01:39:32 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.37.129.2, 56519, None)
23/03/21 01:39:32 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.37.129.2, 56519, None)
23/03/21 01:39:33 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir.
23/03/21 01:39:33 INFO SharedState: Warehouse path is 'file:/Users/hongbo-miao/find-retired-people-scala/spark-warehouse'.
Exception in thread "main" java.lang.NoSuchMethodError: 'org.apache.spark.sql.DatasetHolder org.apache.spark.sql.SparkSession$implicits$.localSeqToDatasetHolder(scala.collection.immutable.Seq, org.apache.spark.sql.Encoder)'
    at FindRetiredPeople$.main(FindRetiredPeople.scala:22)
    at FindRetiredPeople.main(FindRetiredPeople.scala)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
    at java.base/java.lang.reflect.Method.invoke(Method.java:578)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
23/03/21 01:39:33 INFO SparkContext: Invoking stop() from shutdown hook
23/03/21 01:39:33 INFO SparkUI: Stopped Spark web UI at http://10.37.129.2:4040
23/03/21 01:39:33 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/03/21 01:39:33 INFO MemoryStore: MemoryStore cleared
23/03/21 01:39:33 INFO BlockManager: BlockManager stopped
23/03/21 01:39:33 INFO BlockManagerMaster: BlockManagerMaster stopped
23/03/21 01:39:33 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/03/21 01:39:33 INFO SparkContext: Successfully stopped SparkContext
23/03/21 01:39:33 INFO ShutdownHookManager: Shutdown hook called
23/03/21 01:39:33 INFO ShutdownHookManager: Deleting directory /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-4ded8a6d-f9f3-4483-a62a-78edee62932d
23/03/21 01:39:33 INFO ShutdownHookManager: Deleting directory /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-8709497e-302d-4e1b-8eb2-2a1b8f77edfe
make: *** [spark-submit-local] Error 1

What does the above exception really mean?

I do see Spark officially supports Scala 2.13:

Spark runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+.

When using the Scala API, it is necessary for applications to use the same version of Scala that Spark was compiled for. For example, when using Scala 2.13, use Spark compiled for 2.13, and compile code/applications for Scala 2.13 as well.

Any guidance would be appreciated, thanks!

Upvotes: 1

Views: 1770

Answers (1)

Rahul Sahoo

Reputation: 159

This NoSuchMethodError mostly occurs when code is compiled against a different Scala version than the one the Spark installation itself was built with. If the Scala version used to compile your code differs from the Scala version used to build the Spark that spark-submit runs, method calls into Spark can resolve to signatures that do not exist at runtime. Your stack trace shows exactly that: the 2.13-compiled JAR calls localSeqToDatasetHolder(scala.collection.immutable.Seq, ...), but a Spark distribution built for Scala 2.12 only provides the variant taking scala.collection.Seq (the default Seq alias became the immutable one in 2.13), hence the NoSuchMethodError.

Please check both Scala versions; in your case, both should be 2.13.
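
One way to check the Scala version that the local Spark distribution (the one spark-submit belongs to) was built with — the version numbers below are just illustrative:

spark-submit --version
# The banner includes a line such as:
#   Using Scala version 2.12.15, OpenJDK 64-Bit Server VM, 11.0.18

If it reports Scala 2.12, either compile your application for 2.12, or install a Spark distribution built for Scala 2.13 (Apache publishes one separately, e.g. spark-3.3.2-bin-hadoop3-scala2.13.tgz).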

Additionally, don't bundle Spark into your compiled JAR; mark the Spark dependencies as provided:

build.sbt

name := "FindRetiredPeople"
version := "1.0"
scalaVersion := "2.13.10"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.3.2" % "provided",
  "org.apache.spark" %% "spark-sql" % "3.3.2" % "provided",
  "org.apache.spark" %% "spark-streaming" % "3.3.2",
)
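
Note that once the Spark dependencies are marked provided, a plain sbt run will fail with NoClassDefFoundError, because provided jars are excluded from the runtime classpath. A common workaround (this is the idiom documented in the sbt-assembly README, not something specific to your project) is to add them back for the run task only:

// in build.sbt: run the app with provided dependencies on the classpath
Compile / run := Defaults.runTask(
  Compile / fullClasspath,
  Compile / run / mainClass,
  Compile / run / runner
).evaluated

With that in place, sbt run still works locally, while the packaged JAR excludes Spark for spark-submit.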

Do let me know if it helps.

Upvotes: 3
