Saulo Ricci

Reputation: 774

NullPointerException caught when writing to Bigtable using Apache Beam's Dataflow SDK

I'm using Apache Beam SDK version 0.2.0-incubating-SNAPSHOT and trying to write data to Bigtable with the Dataflow runner. Unfortunately, I'm getting a NullPointerException when executing my Dataflow pipeline, where I'm using BigtableIO.Write as my sink. I've already checked my BigtableOptions, and the parameters are fine, according to my needs.

Basically, I create a pipeline, and at some point it has a step that writes a PCollection<KV<ByteString, Iterable<Mutation>>> to my desired Bigtable table:

final BigtableOptions.Builder optionsBuilder =
    new BigtableOptions.Builder().setProjectId(System.getProperty("PROJECT_ID"))
        .setInstanceId(System.getProperty("BT_INSTANCE_ID"));

// do intermediary steps and create PCollection<KV<ByteString, Iterable<Mutation>>> 
// to write to bigtable

// modifiedHits is a PCollection<KV<ByteString, Iterable<Mutation>>>
modifiedHits.apply("writing to big table", BigtableIO.write()
    .withBigtableOptions(optionsBuilder).withTableId(System.getProperty("BT_TABLENAME")));

p.run();
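
For reference, a minimal sketch of how the surrounding setup looks (simplified; assume the runner is selected via the usual --runner command-line flag):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

// Inside main(String[] args). The IDs arrive as JVM properties, e.g.
// -DPROJECT_ID=my-project -DBT_INSTANCE_ID=my-instance -DBT_TABLENAME=my-table
PipelineOptions options =
    PipelineOptionsFactory.fromArgs(args).withValidation().create();
Pipeline p = Pipeline.create(options);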

When executing the pipeline, I get the NullPointerException, pointing exactly at the public void processElement(ProcessContext c) method of the BigtableIO class:

(6e0ccd8407eed08b): java.lang.NullPointerException at org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Write$BigtableWriterFn.processElement(BigtableIO.java:532)

I checked that this method processes each element before writing to Bigtable, but I'm not sure why I'm getting this exception every time I execute the pipeline. According to the code below, the method uses the bigtableWriter attribute to write each c.element(); since checkForFailures() runs first, that makes me suspect bigtableWriter itself is null, but I can't even set a breakpoint to confirm exactly where the null is. Any advice or suggestions on how to solve this issue?

@ProcessElement
  public void processElement(ProcessContext c) throws Exception {
    checkForFailures();
    Futures.addCallback(
        bigtableWriter.writeRecord(c.element()), new WriteExceptionCallback(c.element()));
    ++recordsWritten;
  }

Thanks.

Upvotes: 0

Views: 551

Answers (1)

jkff

Reputation: 17913

I looked up the job and its classpath, and if I'm not mistaken it looks like you're using version 0.3.0-incubating-SNAPSHOT of beam-sdks-java-{core,io}, but version 0.2.0-incubating-SNAPSHOT of google-cloud-dataflow-java.
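
If you want to double-check which versions actually ended up on the classpath, one quick (if not bulletproof) way is to read the Implementation-Version from each jar's manifest; the entry may be absent, in which case this prints null:

public class VersionCheck {
  public static void main(String[] args) {
    // Reads the Implementation-Version recorded in each SDK jar's manifest.
    System.out.println("beam-sdks-java-core: "
        + org.apache.beam.sdk.Pipeline.class.getPackage().getImplementationVersion());
    System.out.println("beam-sdks-java-io (BigtableIO): "
        + org.apache.beam.sdk.io.gcp.bigtable.BigtableIO.class.getPackage().getImplementationVersion());
  }
}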

I believe the issue is caused by this mismatch: you have to use the same version of both. More specifically, BigtableIO in version 0.3.0 uses @Setup and @Teardown methods to create and close the Bigtable writer, but the 0.2.0 runner does not support those methods yet. As a result @Setup is never invoked, bigtableWriter is still null when processElement runs, and the first writeRecord call throws the NullPointerException you're seeing.
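
To illustrate the mechanism with a minimal sketch (this is not the actual BigtableIO source; FakeWriter is a hypothetical stand-in for the real Bigtable writer):

import org.apache.beam.sdk.transforms.DoFn;

class LifecycleWriteFn extends DoFn<String, Void> {
  // Created by the runner via @Setup; stays null if @Setup is never called.
  private transient FakeWriter writer;

  @Setup
  public void setup() {
    // A 0.3.0-style SDK expects the runner to invoke this before processing.
    writer = new FakeWriter();
  }

  @ProcessElement
  public void processElement(ProcessContext c) {
    // A runner without @Setup support (like the 0.2.0 one) never initialized
    // 'writer', so this line throws NullPointerException.
    writer.write(c.element());
  }

  @Teardown
  public void teardown() {
    if (writer != null) {
      writer.close();
    }
  }

  // Hypothetical stand-in, not the real Bigtable writer API.
  static class FakeWriter {
    void write(String record) { /* ... */ }
    void close() { /* ... */ }
  }
}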

Upvotes: 2
