Arno

SecretManagerServiceClient.create() raises NoSuchMethodError io.grpc.MethodDescriptor$Marshaller with Scala on Dataproc Batch

I'm trying to read from Google Secret Manager. I deploy my Scala/Spark application as a fat JAR on Dataproc, built with sbt.

When I call:

val client: SecretManagerServiceClient = SecretManagerServiceClient.create()
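For context, the call sits in code roughly like the following minimal sketch (the project and secret names here are placeholders, not my real values):

```scala
import com.google.cloud.secretmanager.v1.{SecretManagerServiceClient, SecretVersionName}

object SecretExample {
  def main(args: Array[String]): Unit = {
    // This is the line that throws the NoSuchMethodError on Dataproc.
    val client: SecretManagerServiceClient = SecretManagerServiceClient.create()
    try {
      // "my-project" / "my-secret" are placeholders; "latest" resolves
      // the most recent enabled version of the secret.
      val name = SecretVersionName.of("my-project", "my-secret", "latest")
      val payload = client.accessSecretVersion(name).getPayload.getData.toStringUtf8
      println(payload)
    } finally {
      client.close()
    }
  }
}
```

(Running this requires GCP credentials, so it only reproduces inside the Dataproc batch.)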

I get the following error on Dataproc:

Using the default container image
Waiting for container log creation
PYSPARK_PYTHON=/opt/dataproc/conda/bin/python
Generating /home/spark/.pip/pip.conf
Configuring index-url as 'https://europe-python.pkg.dev/artifact-registry-python-cache/virtual-python/simple/'
JAVA_HOME=/usr/lib/jvm/temurin-17-jdk-amd64
SPARK_EXTRA_CLASSPATH=
:: loading settings :: file = /etc/spark/conf/ivysettings.xml
24/08/30 13:35:09 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-google-hadoop-file-system.properties,hadoop-metrics2.properties
24/08/30 13:35:09 INFO MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
24/08/30 13:35:09 INFO MetricsSystemImpl: google-hadoop-file-system metrics system started
Exception in thread "main" java.lang.NoSuchMethodError: 'io.grpc.MethodDescriptor$Marshaller io.grpc.protobuf.ProtoUtils.marshaller(repackaged.com.google.protobuf.Message)'
        at com.google.cloud.secretmanager.v1.stub.GrpcSecretManagerServiceStub.<clinit>(GrpcSecretManagerServiceStub.java:72)
        at com.google.cloud.secretmanager.v1.stub.SecretManagerServiceStubSettings.createStub(SecretManagerServiceStubSettings.java:350)
        at com.google.cloud.secretmanager.v1.SecretManagerServiceClient.<init>(SecretManagerServiceClient.java:455)
        at com.google.cloud.secretmanager.v1.SecretManagerServiceClient.create(SecretManagerServiceClient.java:437)
        at com.google.cloud.secretmanager.v1.SecretManagerServiceClient.create(SecretManagerServiceClient.java:428)
        at fr.mycomp.graphuser.GraphUserApp$.getCredentials(GraphUserApp.scala:30)
        at fr.mycomp.graphuser.GraphUserApp$.delayedEndpoint$fr$mycomp$graphuser$GraphUserApp$1(GraphUserApp.scala:95)
        at fr.mycomp.graphuser.GraphUserApp$delayedInit$body.apply(GraphUserApp.scala:15)
        at scala.Function0.apply$mcV$sp(Function0.scala:39)
        at scala.Function0.apply$mcV$sp$(Function0.scala:39)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
        at scala.App.$anonfun$main$1$adapted(App.scala:80)
        at scala.collection.immutable.List.foreach(List.scala:431)
        at scala.App.main(App.scala:80)
        at scala.App.main$(App.scala:78)
        at fr.mycomp.graphuser.GraphUserApp$.main(GraphUserApp.scala:15)
        at fr.mycomp.graphuser.GraphUserApp.main(GraphUserApp.scala)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1032)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1124)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1133)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
ERROR: (gcloud.dataproc.batches.submit.spark) Batch job is FAILED. Detail: Job failed with message [Exception in thread "main" java.lang.NoSuchMethodError: 'io.grpc.MethodDescriptor$Marshaller io.grpc.protobuf.ProtoUtils.marshaller(repackaged.com.google.protobuf.Message)']

Here is my build.sbt:

val sparkVersion = settingKey[String]("Spark version")

lazy val root = (project in file("."))
  .settings(
    inThisBuild(List(
      organization := "fr.mycomp",
      scalaVersion := "2.12.13"
    )),
    name := "graphUser",
    version := "0.0.1",

    sparkVersion := "3.5.0",

    javacOptions ++= Seq("-source", "1.8", "-target", "1.8"),
    javaOptions ++= Seq("-Xms512M", "-Xmx2048M"),
    scalacOptions ++= Seq("-deprecation", "-unchecked"),
    parallelExecution in Test := false,
    fork := true,

    coverageHighlighting := true,

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % sparkVersion.value exclude("com.google.protobuf","protobuf-java"),
      "org.apache.spark" %% "spark-sql" % sparkVersion.value exclude("com.google.protobuf","protobuf-java"),
      "org.apache.spark" %% "spark-graphx" % sparkVersion.value exclude("com.google.protobuf","protobuf-java"),

      // Snowflake Connector for Spark
      "net.snowflake" % "spark-snowflake_2.12" % "3.0.0",
      "net.snowflake" % "snowflake-jdbc" % "3.17.0",

      // https://storage.googleapis.com/cloud-opensource-java-dashboard/com.google.cloud/libraries-bom/26.45.0/index.html
      "com.google.cloud" % "google-cloud-storage" % "2.40.1",
      "com.google.cloud" % "google-cloud-secretmanager" % "2.48.0",
      "com.google.protobuf" % "protobuf-java" % "3.25.4",
      "com.google.protobuf" % "protobuf-java-util" % "3.25.4",
      "io.grpc" % "grpc-all" % "1.66.0",
      "io.grpc" % "grpc-protobuf" % "1.66.0",
      "io.grpc" % "grpc-okhttp" % "1.66.0",
      "io.grpc" % "grpc-protobuf-lite" % "1.66.0",
      "io.grpc" % "grpc-stub" % "1.66.0",

      // PureConfig for configuration management
      "com.github.pureconfig" %% "pureconfig" % "0.17.6",

      // SLF4J logging dependencies
      "org.slf4j" % "slf4j-api" % "2.0.9",
      "org.slf4j" % "slf4j-log4j12" % "2.0.9",

      // Log4J logging dependencies
      "org.apache.logging.log4j" % "log4j-api" % "2.23.1",
      "org.apache.logging.log4j" % "log4j-core" % "2.23.1",

      // JSON4S for JSON parsing
      "org.json4s" %% "json4s-native" % "3.6.6",
      "org.json4s" %% "json4s-jackson" % "3.6.6",
      "org.json4s" %% "json4s-core" % "3.6.6",
      "org.json4s" %% "json4s-ast" % "3.6.6",
      "org.json4s" %% "json4s-scalap" % "3.6.6",

      // Testing dependencies
      "org.scalatest" %% "scalatest" % "3.2.19" % Test,
      "org.scalacheck" %% "scalacheck" % "1.18.0" % Test,
      "com.holdenkarau" %% "spark-testing-base" % s"${sparkVersion.value}_1.5.3" % Test
    ),

    // Configure the run task
    run in Compile := Defaults.runTask(fullClasspath in Compile, mainClass in (Compile, run), runner in (Compile, run)).evaluated,

    // Additional build settings
    scalacOptions ++= Seq("-deprecation", "-unchecked"),
    pomIncludeRepository := { x => false },

    // Repositories for dependencies
    resolvers ++= Seq(
      "sonatype-releases" at "https://oss.sonatype.org/content/repositories/releases/",
      "Typesafe repository" at "https://repo.typesafe.com/typesafe/releases/",
      "Second Typesafe repo" at "https://repo.typesafe.com/typesafe/maven-releases/",
      "Spark Packages Repo" at "https://repos.spark-packages.org/",
      "Maven Central" at "https://repo1.maven.org/maven2/",
      Resolver.sonatypeRepo("public")
    ),

    // Publishing settings
    publishTo := {
      val nexus = "https://oss.sonatype.org/"
      if (isSnapshot.value)
        Some("snapshots" at nexus + "content/repositories/snapshots")
      else
        Some("releases"  at nexus + "service/local/staging/deploy/maven2")
    }
  )

import sbtassembly.AssemblyPlugin.autoImport._

// Strategy for handling conflicts in META-INF during assembly
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}

// Shade rules to prevent conflicts in shaded libraries
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "repackaged.com.google.common.@1").inAll,
  ShadeRule.rename("com.google.protobuf.**" -> "repackaged.com.google.protobuf.@1").inAll,
//  ShadeRule.rename("io.grpc.**" -> "repackaged.io.grpc.@1").inAll
)

I tried to align the dependency versions with the Google Cloud libraries BOM, and I also tried switching to Maven, but I still get the same problem...

I use this command to launch my app:

gcloud dataproc batches submit spark --batch=graphuser-$(date '+%Y%m%d')-$(date +%s) --region=europe-west4 --subnet=dataproc-subnet-eu-west4 --version=1.2 --properties=spark.dataproc.scaling.version=2,spark.dynamicAllocation.enabled=true,spark.dynamicAllocation.initialExecutors=4,spark.dynamicAllocation.minExecutors=4,spark.sql.shuffle.partitions=500,spark.executor.memory=12g,spark.dynamicAllocation.executorAllocationRatio=0.5,spark.dynamicAllocation.maxExecutors=100,spark.driver.memory=9g --ttl=12h --jar=gs://dataproc-data-eng/user_ids/graphUser-assembly-0.0.1.jar
