Reputation: 604
What I need: the right combination of dependency versions to read from and write to Datastore in Dataflow (v1.9.0) via DatastoreIO.v1().read()/write(), and to know which dependencies need to be referenced in the pom.
Dataflow-specific dependencies referenced in the pom (from the Maven repository) for Dataflow 1.9.0:
com.google.cloud.dataflow/google-cloud-dataflow-java-sdk-all/1.9.0
com.google.cloud.datastore/datastore-v1-protos/1.0.1
com.google.cloud.datastore/datastore-v1-proto-client/1.1.0
com.google.protobuf/protobuf-java/3.0.0-beta-1
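For reference, these coordinates translate into pom.xml entries like this:

<dependency>
    <groupId>com.google.cloud.dataflow</groupId>
    <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
    <version>1.9.0</version>
</dependency>
<dependency>
    <groupId>com.google.cloud.datastore</groupId>
    <artifactId>datastore-v1-protos</artifactId>
    <version>1.0.1</version>
</dependency>
<dependency>
    <groupId>com.google.cloud.datastore</groupId>
    <artifactId>datastore-v1-proto-client</artifactId>
    <version>1.1.0</version>
</dependency>
<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
    <version>3.0.0-beta-1</version>
</dependency>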
When writing to Datastore (actually when building the entities) I get the following exception:
// CamelExecutionException (the setup runs with Camel routes, but for development purposes as a local CamelRoute in Eclipse rather than in Fuse)
Caused by: java.lang.NoClassDefFoundError: com/google/protobuf/GeneratedMessageV3
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at com.google.datastore.v1.Value.toBuilder(Value.java:749)
at com.google.datastore.v1.Value.newBuilder(Value.java:743)
at xmlsource.dataflow.test.EntityUtil.getStringValue(EntityUtil.java:404)
at xmlsource.dataflow.test.EntityUtil.getArticleEntity(EntityUtil.java:152)
at xmlsource.dataflow.test.parser.ArticleToEntity.processElement(ArticleToEntity.java:21)
at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:139)
at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateHelper(ParDo.java:1229)
at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateSingleHelper(ParDo.java:1098)
at com.google.cloud.dataflow.sdk.transforms.ParDo.access$300(ParDo.java:457)
at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1084)
at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1079)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:858)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:221)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:103)
at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:260)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:814)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:526)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:96)
at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181)
at xmlsource.dataflow.test.PipelineParseTest.createAndRun(PipelineParseTest.java:208)
at xmlsource.dataflow.test.PipelineTester.process(PipelineTester.java:11)
at org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
... 8 more
The referenced line in xmlsource.dataflow.test.EntityUtil.getStringValue(EntityUtil.java:404):
Value.newBuilder().setStringValue(value).build();
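For context, the surrounding helper is essentially just this (a minimal sketch; only the builder call is verbatim, the signature is assumed from the stack trace):

import com.google.datastore.v1.Value;

// Building a Value loads com.google.datastore.v1.Value, whose generated
// code references com.google.protobuf.GeneratedMessageV3 -- a class that
// protobuf-java 3.0.0-beta-1 does not yet contain, hence the
// NoClassDefFoundError above.
public static Value getStringValue(String value) {
    return Value.newBuilder().setStringValue(value).build();
}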
And when reading, more or less the same:
java.lang.NoClassDefFoundError: com/google/protobuf/GeneratedMessageV3
…
When changing the dependencies to the following (the only change being the non-beta version of protobuf-java)
com.google.cloud.datastore/datastore-v1-protos/1.0.1
com.google.cloud.datastore/datastore-v1-proto-client/1.1.0
com.google.protobuf/protobuf-java/3.0.0
and trying to write, the following exception occurs:
// CamelExecutionException...
Caused by: java.lang.VerifyError: Bad type on operand stack
Exception Details:
Location:
com/google/datastore/v1/Value$Builder.mergeGeoPointValue(Lcom/google/type/LatLng;)Lcom/google/datastore/v1/Value$Builder; @76: invokevirtual
Reason:
Type 'com/google/type/LatLng' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage'
Current Frame:
bci: @76
flags: { }
locals: { 'com/google/datastore/v1/Value$Builder', 'com/google/type/LatLng' }
stack: { 'com/google/protobuf/SingleFieldBuilder', 'com/google/type/LatLng' }
Bytecode:
someBytecode
Stackmap Table:
same_frame(@50)
same_frame(@55)
same_frame(@62)
same_frame(@80)
same_frame(@89)
at com.google.datastore.v1.Value.toBuilder(Value.java:749)
at com.google.datastore.v1.Value.newBuilder(Value.java:743)
at xmlsource.dataflow.test.EntityUtil.getStringValue(EntityUtil.java:404)
at xmlsource.dataflow.test.EntityUtil.getArticleEntity(EntityUtil.java:152)
at xmlsource.dataflow.test.parser.ArticleToEntity.processElement(ArticleToEntity.java:21)
at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:139)
at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateHelper(ParDo.java:1229)
at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateSingleHelper(ParDo.java:1098)
at com.google.cloud.dataflow.sdk.transforms.ParDo.access$300(ParDo.java:457)
at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1084)
at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1079)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:858)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:221)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:103)
at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:260)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:814)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:526)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:96)
at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181)
at xmlsource.dataflow.test.PipelineParseTest.createAndRun(PipelineParseTest.java:208)
at xmlsource.dataflow.test.PipelineTester.process(PipelineTester.java:11)
at org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
Here the exception references the method mergeGeoPointValue, although my code never calls anything that sets LatLng or GeoPoint values; presumably the JVM verifies all methods of Value$Builder when the class is loaded, so a protobuf mismatch in any single method fails the whole class. The referenced line in my code again just sets the string value.
When reading, I get the same exception, again when transforming the POJO into a Datastore entity:
Value.newBuilder().setStringValue("someString").build()
The whole Query:
Query query = Query.newBuilder()
.addKind(KindExpression.newBuilder()
.setName("test_article").build())
.setFilter(Filter.newBuilder()
.setPropertyFilter(PropertyFilter.newBuilder()
.setProperty(PropertyReference.newBuilder()
.setName("somePropertyName"))
.setOp(PropertyFilter.Operator.EQUAL)
.setValue(Value.newBuilder()
.setStringValue("someString").build())
.build())
.build())
.build();
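As noted further down, the same exceptions occur when building the values through the DatastoreHelper convenience methods instead, so the problem does not seem to be how the values are built. For completeness, a sketch of that variant (assuming the makeFilter/makeValue overloads from datastore-v1-proto-client):

import static com.google.datastore.v1.client.DatastoreHelper.makeFilter;
import static com.google.datastore.v1.client.DatastoreHelper.makeValue;

// Same query as above, built via DatastoreHelper; it fails with the
// same exceptions, which points at the classpath, not the builder code.
Query query = Query.newBuilder()
        .addKind(KindExpression.newBuilder().setName("test_article"))
        .setFilter(makeFilter("somePropertyName",
                PropertyFilter.Operator.EQUAL,
                makeValue("someString").build()))
        .build();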
Changing the dependencies to (datastore-v1-protos/1.3.0):
com.google.cloud.datastore/datastore-v1-protos/1.3.0
com.google.cloud.datastore/datastore-v1-proto-client/1.1.0
com.google.protobuf/protobuf-java/3.0.0 (or 3.2.0)
With this setup I can successfully write to Datastore via .apply(DatastoreIO.v1().write().withProjectId("someProjectId"));
When trying to read, the Query object is built successfully, but then:
// CamelExecutionException
Caused by: java.lang.NoSuchMethodError: com.google.datastore.v1.Query$Builder.clone()Lcom/google/protobuf/GeneratedMessage$Builder;
at com.google.cloud.dataflow.sdk.io.datastore.DatastoreV1$Read$ReadFn.processElement(DatastoreV1.java:648)
at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:139)
at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateHelper(ParDo.java:1229)
at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateSingleHelper(ParDo.java:1098)
at com.google.cloud.dataflow.sdk.transforms.ParDo.access$300(ParDo.java:457)
at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1084)
at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1079)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:858)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:221)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:103)
at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:260)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:814)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:526)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:96)
at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181)
at xmlsource.dataflow.test.PipelineParseTest.createAndRun(PipelineParseTest.java:208)
at xmlsource.dataflow.test.PipelineTester.process(PipelineTester.java:11)
at org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
... 8 more
The line where I try to read from Datastore:
PCollection<Entity> entityCollection = p.apply(
DatastoreIO.v1().read().withNamespace("test_ns_df")
.withProjectId("someProjectId")
.withQuery(query));
EDIT: When using the dependencies (and parent pom) from GitHubDataflowExample, I again get the java.lang.NoClassDefFoundError: com/google/protobuf/GeneratedMessageV3 when building the Value for the Query....
So I never got the read to work... Has anyone experienced similar problems and found out how to solve this? Or do I need to build the values differently? The same exceptions occur when using DatastoreHelper.makeValue... The dependencies referenced in a working project would also help a lot!
I thought this was a dependency/version problem, but maybe one of you knows better. I can't be the first one running into java.lang.NoSuchMethodError: com.google.datastore.v1.Query$Builder.clone()
like this question: NoSuchMethodError in DatastoreWordCount example, where the asker had just pulled in a wrong version; on my end, however, that doesn't lead to success.
Thanks in advance
Upvotes: 1
Views: 452
Reputation: 604
Found the problem:
Because a preprocessing step in the same project runs with Camel/Fuse and stores files in Google Cloud Storage, I had a dependency on google-cloud-storage:
<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-storage</artifactId>
    <version>0.6.0</version>
</dependency>
This dependency was declared in the pom.xml BEFORE the Dataflow dependency. After switching the order of the dependencies (Dataflow before Storage, see the sketch below) and removing all other dependencies, DatastoreIO works perfectly! Then, depending on your operations (for example an XmlSource), some runtime dependencies need to be added.
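A sketch of the ordering that works for me (Maven resolves version conflicts by "nearest wins, then first declaration wins", so declaring the Dataflow SDK first lets its transitive protobuf/datastore versions win over those pulled in by google-cloud-storage):

<dependencies>
    <!-- Declared first: its transitive protobuf/datastore versions win -->
    <dependency>
        <groupId>com.google.cloud.dataflow</groupId>
        <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
        <version>1.9.0</version>
    </dependency>
    <!-- Storage now comes after the Dataflow SDK -->
    <dependency>
        <groupId>com.google.cloud</groupId>
        <artifactId>google-cloud-storage</artifactId>
        <version>0.6.0</version>
    </dependency>
</dependencies>

Running mvn dependency:tree -Dverbose shows which protobuf-java and datastore artifacts end up on the classpath and which are omitted for conflict, which is how this kind of clash can be spotted.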
Upvotes: 1