Reputation: 13
I am creating an application in Spark. I use Avro files in HDFS with Hadoop 2. I use Maven and include Avro like this:
<dependency>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-mapred</artifactId>
    <version>1.7.6</version>
    <classifier>hadoop2</classifier>
</dependency>
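This error usually means a hadoop1 build of avro-mapred ended up on the classpath despite the hadoop2 classifier. One way to check what Maven actually resolves is the dependency tree; a quick sketch:

```shell
# Show which Avro artifacts Maven resolves for this project;
# avro-mapred should appear with the hadoop2 classifier.
mvn dependency:tree -Dincludes=org.apache.avro
```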
I wrote a unit test, and everything works when I run mvn test. But when I launch with spark-submit it fails, and I get this error:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost task 1.0 in stage 0.0 (TID 1, localhost): java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
at org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
Can you help me?
Thank you.
Upvotes: 0
Views: 541
Reputation: 13
But that isn't a solution with spark-submit --master yarn-cluster.
I still get the same error:
WARN scheduler.TaskSetManager: Lost task 9.1 in stage 0.0 (TID 15, 10.163.34.129): java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected at org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
Does anyone have another idea?
Upvotes: 0
Reputation: 13
OK, I found the solution :D Thanks to http://apache-spark-developers-list.1001551.n3.nabble.com/Fwd-Unable-to-Read-Write-Avro-RDD-on-cluster-td10893.html.
The solution is to add the jars to your SPARK_CLASSPATH:
export SPARK_CLASSPATH=yourpath/avro-mapred-1.7.7-hadoop2.jar:yourpath/avro-1.7.7.jar
You can download the jars here: http://repo1.maven.org/maven2/org/apache/avro/avro-mapred/1.7.7/
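Note that SPARK_CLASSPATH was deprecated in later Spark releases; the same effect can be achieved by shipping the jars with spark-submit's --jars option, which puts them on both the driver and executor classpaths. A minimal sketch, assuming the jars live under yourpath/ and a hypothetical main class com.example.MyApp:

```shell
# Ship the hadoop2 build of avro-mapred (plus the matching avro jar)
# alongside the application jar when submitting to YARN.
spark-submit \
  --master yarn-cluster \
  --jars yourpath/avro-mapred-1.7.7-hadoop2.jar,yourpath/avro-1.7.7.jar \
  --class com.example.MyApp \
  myapp.jar
```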
Upvotes: 1