Gokulraj
Gokulraj

Reputation: 520

Elasticsearch support for spark 2.4.2 with scala 2.12

I'm not able to find any ES 6.7.1 supporting jar for spark 2.4.2 with scala 2.12 In maven repo only scala 2.11 and 2.10 is supported for the jar.

<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-spark-20_2.11</artifactId>
    <version>6.7.1</version>
</dependency>

For my application we are using spark 2.4.2 which supports only scala 2.12 version. Following is the error shown when I try to run with "elasticsearch-spark-20_2.11" jar

StreamingExecutionRelation KafkaV2[Subscribe[test_topic]], [key#7, value#8, topic#9, partition#10, offset#11L, timestamp#12, timestampType#13]

        at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:302)
        at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:193)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
        at org.elasticsearch.spark.sql.DataFrameValueWriter.writeStruct(DataFrameValueWriter.scala:78)
        at org.elasticsearch.spark.sql.DataFrameValueWriter.write(DataFrameValueWriter.scala:70)
        at org.elasticsearch.spark.sql.DataFrameValueWriter.write(DataFrameValueWriter.scala:53)
        at org.elasticsearch.hadoop.serialization.builder.ContentBuilder.value(ContentBuilder.java:53)
        at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.doWriteObject(TemplatedBulk.java:71)
        at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.write(TemplatedBulk.java:58)
        at org.elasticsearch.hadoop.serialization.bulk.BulkEntryWriter.writeBulkEntry(BulkEntryWriter.java:68)
        at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:170)
        at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:74)
        at org.elasticsearch.spark.sql.streaming.EsStreamQueryWriter.run(EsStreamQueryWriter.scala:41)
        at org.elasticsearch.spark.sql.streaming.EsSparkSqlStreamingSink$$anonfun$addBatch$2$$anonfun$2.apply(EsSparkSqlStreamingSink.scala:52)
        at org.elasticsearch.spark.sql.streaming.EsSparkSqlStreamingSink$$anonfun$addBatch$2$$anonfun$2.apply(EsSparkSqlStreamingSink.scala:51)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:411)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Upvotes: 3

Views: 2376

Answers (2)

Powers
Powers

Reputation: 19328

The last Maven release for that project was in November 2016 and Scala 2.12 is still not supported.

This is unfortunate because Spark 3 requires Scala 2.12, so if your project depends on elasticsearch-spark, then you can't upgrade to Spark 3.

elasticsearch-hadoop claims to support Spark and looks to be actively maintained, so that might be your best bet.

As mentioned in this post watch out for what your depend on in the Spark / Scala world because dependencies are often abandoned, leaving users in difficult situations.

Upvotes: 1

baitmbarek
baitmbarek

Reputation: 2518

Sorry for the delay,

We can't use Elasticsearch Spark / Elasticsearch Hadoop libraries yet with Scala 2.12. There exists an opened merge request (https://github.com/elastic/elasticsearch-hadoop/pull/1308) which is pending, waiting for tests to pass.

You either have to downgrade your project to use Scala 2.11 or have to wait for the libraries to be released on Scala 2.12

Upvotes: 2

Related Questions