Reputation: 334
I am trying to run the Apache Beam v2.0.0 word count example on Spark v1.6.x (via Yarn v2.7.3) such that it reads from and writes to HDFS (v2.7.3).
Currently, I submit the job via the following command:
bin/spark-submit --class org.apache.beam.examples.WordCount \
--master yarn --deploy-mode cluster \
test/word-count-beam-1.0-SNAPSHOT.jar \
--inputFile=hdfs://test/input/* \
--output=hdfs://test/output \
--runner=SparkRunner --sparkMaster=yarn
Unfortunately, the job fails with the following exception:
Failed to serialize and deserialize property 'hdfsConfiguration' with value '[Configuration: /usr/hdp/current/hadoop-client/conf/core-site.xml, /usr/hdp/current/hadoop-client/conf/hdfs-site.xml]'
Here is the full stack trace:
java.lang.IllegalStateException: Failed to serialize the pipeline options.
at org.apache.beam.runners.spark.translation.SparkRuntimeContext.serializePipelineOptions(SparkRuntimeContext.java:58)
at org.apache.beam.runners.spark.translation.SparkRuntimeContext.<init>(SparkRuntimeContext.java:41)
at org.apache.beam.runners.spark.translation.EvaluationContext.<init>(EvaluationContext.java:67)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:196)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:85)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:295)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:281)
at at.tmobile.bigdata.examples.WordCount.main(WordCount.java:184)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:561)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Unexpected IOException (of type java.io.IOException): Failed to serialize and deserialize property 'hdfsConfiguration' with value '[Configuration: /usr/hdp/current/hadoop-client/conf/core-site.xml, /usr/hdp/current/hadoop-client/conf/hdfs-site.xml]'
at com.fasterxml.jackson.databind.JsonMappingException.fromUnexpectedIOE(JsonMappingException.java:163)
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2342)
at org.apache.beam.runners.spark.translation.SparkRuntimeContext.serializePipelineOptions(SparkRuntimeContext.java:56)
... 12 more
Caused by: java.io.IOException: Failed to serialize and deserialize property 'hdfsConfiguration' with value '[Configuration: /usr/hdp/current/hadoop-client/conf/core-site.xml, /usr/hdp/current/hadoop-client/conf/hdfs-site.xml]'
at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.ensureSerializable(ProxyInvocationHandler.java:710)
at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.serialize(ProxyInvocationHandler.java:629)
at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.serialize(ProxyInvocationHandler.java:618)
at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:128)
at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:2881)
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2338)
... 13 more
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Conflicting property-based creators: already had [constructor for java.util.ArrayList, annotations: [null]], encountered [constructor for java.util.ArrayList, annotations: [null]]
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:266)
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:241)
at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:142)
at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:394)
at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:3169)
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3062)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2175)
at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.ensureSerializable(ProxyInvocationHandler.java:708)
... 18 more
Caused by: java.lang.IllegalArgumentException: Conflicting property-based creators: already had [constructor for java.util.ArrayList, annotations: [null]], encountered [constructor for java.util.ArrayList, annotations: [null]]
at com.fasterxml.jackson.databind.deser.impl.CreatorCollector.verifyNonDup(CreatorCollector.java:228)
at com.fasterxml.jackson.databind.deser.impl.CreatorCollector.addPropertyCreator(CreatorCollector.java:168)
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory._handleSingleArgumentConstructor(BasicDeserializerFactory.java:487)
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory._addDeserializerConstructors(BasicDeserializerFactory.java:406)
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory._constructDefaultValueInstantiator(BasicDeserializerFactory.java:325)
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findValueInstantiator(BasicDeserializerFactory.java:266)
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.createCollectionDeserializer(BasicDeserializerFactory.java:851)
at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:390)
at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:348)
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:261)
... 25 more
Does anybody know how to fix this?
Upvotes: 2
Views: 1448
Reputation: 410
I had the same problem.
The Jackson modules involved are the ones discovered through java.util.ServiceLoader.load(com.fasterxml.jackson.databind.Module.class).
The problem is with the hdfsConfiguration property, which has type ArrayList<Configuration>.
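To see which Jackson modules are actually picked up on your classpath, something like the following sketch may help. It only uses java.util.ServiceLoader; the class name JacksonModuleLister and the helper providerNames are made up for illustration, and the Jackson service type is looked up by name so the snippet compiles even without Jackson present. Run it with the same classpath as the job, otherwise the output will not reflect what the Spark driver sees.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical helper, not part of Beam or Jackson.
public class JacksonModuleLister {

    // Returns the class names of all providers that ServiceLoader discovers
    // for the given service type. Returns an empty list if the service type
    // itself is not on the classpath.
    static List<String> providerNames(String serviceClassName) {
        List<String> names = new ArrayList<>();
        try {
            Class<?> service = Class.forName(serviceClassName);
            for (Object provider : ServiceLoader.load(service)) {
                names.add(provider.getClass().getName());
            }
        } catch (ClassNotFoundException e) {
            // Service type not available: nothing to list.
        }
        return names;
    }

    public static void main(String[] args) {
        // Same service type as in the ServiceLoader call quoted above.
        for (String name : providerNames("com.fasterxml.jackson.databind.Module")) {
            System.out.println(name);
        }
    }
}
```

If ParanamerModule shows up in the output, it is a likely candidate for the conflict described here.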
Excluding the paranamer dependency from the jackson-module-scala dependency in the spark-runner profile helps:
<profiles>
  <profile>
    <id>spark-runner</id>
    <dependencies>
      ...
      <dependency>
        <groupId>com.fasterxml.jackson.module</groupId>
        <artifactId>jackson-module-scala_2.10</artifactId>
        <version>2.8.8</version>
        <scope>runtime</scope>
        <exclusions>
          <exclusion>
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-paranamer</artifactId>
          </exclusion>
        </exclusions>
      </dependency>
      ...
    </dependencies>
  </profile>
</profiles>
ParanamerModule checks property annotations and fails on the ArrayList constructors, but the module is optional, so it can be excluded.
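After rebuilding the jar, a quick way to confirm the exclusion took effect is to check whether the module class can still be loaded. This is only a sketch: ParanamerCheck and isOnClasspath are hypothetical names, and the check is meaningful only when run with the same classpath as the job (on a cluster, the Spark installation may bring its own Jackson jars).

```java
// Hypothetical helper, not part of Beam or Jackson.
public class ParanamerCheck {

    // Returns true if the named class can be found by this class's class loader.
    // The class is located but not initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ParanamerCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Module class shipped in the jackson-module-paranamer artifact.
        String module = "com.fasterxml.jackson.module.paranamer.ParanamerModule";
        System.out.println(module + " present: " + isOnClasspath(module));
    }
}
```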
Upvotes: 1