Reputation: 774
I'm using Apache Beam SDK version 0.2.0-incubating-SNAPSHOT and trying to write data to a Bigtable table with the Dataflow runner. Unfortunately I'm getting a NullPointerException when executing my pipeline, where I'm using BigtableIO.Write as my sink. I have already checked my BigtableOptions and the parameters are fine for my needs.
Basically, I create the pipeline, and at some point it has a step that writes a PCollection<KV<ByteString, Iterable<Mutation>>> to my desired Bigtable table:
final BigtableOptions.Builder optionsBuilder =
    new BigtableOptions.Builder()
        .setProjectId(System.getProperty("PROJECT_ID"))
        .setInstanceId(System.getProperty("BT_INSTANCE_ID"));

// do intermediary steps and create PCollection<KV<ByteString, Iterable<Mutation>>>
// to write to bigtable

// modifiedHits is a PCollection<KV<ByteString, Iterable<Mutation>>>
modifiedHits.apply("writing to big table", BigtableIO.write()
    .withBigtableOptions(optionsBuilder)
    .withTableId(System.getProperty("BT_TABLENAME")));

p.run();
When executing the pipeline, I get the NullPointerException pointing exactly at the BigtableIO class, in the public void processElement(ProcessContext c) method:
(6e0ccd8407eed08b): java.lang.NullPointerException at org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Write$BigtableWriterFn.processElement(BigtableIO.java:532)
I checked that this method is processing each element before writing to Bigtable, but I'm not sure why I'm getting this exception every time I execute the pipeline. According to the code below, the method uses the bigtableWriter attribute to process each c.element(), but I can't even set a breakpoint to debug where exactly the null is. Any advice or suggestion on how to solve this issue?
@ProcessElement
public void processElement(ProcessContext c) throws Exception {
  checkForFailures();
  Futures.addCallback(
      bigtableWriter.writeRecord(c.element()), new WriteExceptionCallback(c.element()));
  ++recordsWritten;
}
Thanks.
Upvotes: 0
Views: 551
Reputation: 17913
I looked up the job and its classpath, and if I'm not mistaken it looks like you're using version 0.3.0-incubating-SNAPSHOT of beam-sdks-java-{core,io}, but version 0.2.0-incubating-SNAPSHOT of google-cloud-dataflow-java.
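One way to keep the versions aligned, assuming a Maven build, is to declare a single version property and use it for every Beam artifact. This is just a sketch: the exact artifact IDs (beam-sdks-java-core, beam-sdks-java-io-google-cloud-platform, beam-runners-google-cloud-dataflow-java) are assumptions based on the error context, so adjust them to match your actual build:

```xml
<properties>
  <!-- single source of truth for the Beam version -->
  <beam.version>0.3.0-incubating-SNAPSHOT</beam.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-core</artifactId>
    <version>${beam.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
    <version>${beam.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
    <version>${beam.version}</version>
  </dependency>
</dependencies>
```

With a shared property like this, bumping the version in one place keeps the SDK, IO, and runner artifacts from drifting apart.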
I believe the issue is caused by this mismatch - you have to use the same version of both. More specifically, BigtableIO in version 0.3.0 uses the @Setup and @Teardown methods, but runner 0.2.0 does not support them yet, so the bigtableWriter that would be created in @Setup is never initialized and processElement dereferences null.
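To illustrate why a skipped @Setup surfaces as an NPE inside processElement, here is a minimal plain-Java sketch with no Beam dependencies. The class and method names are invented for illustration; they only mimic the DoFn lifecycle, where a writer field is created in the setup method and used in the per-element method:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a DoFn whose writer is created in @Setup, as BigtableIO 0.3.0 does.
class WriterFn {
    private List<String> writer; // stays null until setup() runs

    // Corresponds to the @Setup lifecycle method.
    void setup() {
        writer = new ArrayList<>();
    }

    // Corresponds to @ProcessElement: assumes setup() has already run.
    void processElement(String element) {
        writer.add(element); // NPE here if the runner never called setup()
    }
}

public class LifecycleDemo {
    public static void main(String[] args) {
        // A runner that does not know about @Setup (like 0.2.0) skips it:
        WriterFn fnWithoutSetup = new WriterFn();
        try {
            fnWithoutSetup.processElement("row-1");
        } catch (NullPointerException e) {
            System.out.println("NPE without setup");
        }

        // A runner that honors @Setup initializes the writer first:
        WriterFn fnWithSetup = new WriterFn();
        fnWithSetup.setup();
        fnWithSetup.processElement("row-1");
        System.out.println("ok with setup");
    }
}
```

The stack trace in the question matches the first case: the element-processing code is correct, but the field it depends on was never initialized because the older runner never invoked the lifecycle method.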
Upvotes: 2