Leif Wickland

Reputation: 3724

NullPointerException when attempting to serialize Avro GenericRecord containing array

I am trying to publish Avro (into Kafka) and get a NullPointerException when attempting to write the Avro object with the BinaryEncoder.

Here is the abbreviated stacktrace:

java.lang.NullPointerException: null of array of com.mycode.DeeplyNestedObject of array of com.mycode.NestedObject of union of com.mycode.ParentObject
    at org.apache.avro.generic.GenericDatumWriter.npe(GenericDatumWriter.java:132) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:126) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60) ~[avro-1.8.1.jar:1.8.1]
    at com.mycode.KafkaAvroPublisher.send(KafkaAvroPublisher.java:61) ~[classes/:na]
    ....
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:112) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143) ~[avro-1.8.1.jar:1.8.1]
    at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105) ~[avro-1.8.1.jar:1.8.1]
    ... 55 common frames omitted

Here is the send method in my code where the exception occurs:

private static final EncoderFactory ENCODER_FACTORY = EncoderFactory.get();
private static final SpecificDatumWriter<ParentObject> PARENT_OBJECT_WRITER = new SpecificDatumWriter<>(ParentObject.SCHEMA$);
private BinaryEncoder binaryEncoder;  // reused across calls to send()

public void send(ParentObject parentObject) {
    try {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        binaryEncoder = ENCODER_FACTORY.binaryEncoder(stream, binaryEncoder);
        PARENT_OBJECT_WRITER.write(parentObject, binaryEncoder);  // Exception HERE
        binaryEncoder.flush();
        producer.send(new ProducerRecord<>(topic, stream.toByteArray()));
    } catch (IOException ioe) {
        logger.debug("Problem publishing message to Kafka.", ioe);
    }
}

In the schema, the NestedObject contains an array of DeeplyNestedObject. I've done enough debugging to confirm that the NestedObject's array field is, in fact, populated with DeeplyNestedObject instances (or set to an empty array when there are none). Here is the relevant part of the schema:

[ { "namespace": "com.mycode.avro"
  , "type": "record"
  , "name": "NestedObject"
  , "fields":
    [ { "name": "timestamp", "type": "long", "doc": "Instant in time (milliseconds since epoch)." }
    , { "name": "objs", "type": { "type": "array", "items": "DeeplyNestedObject" }, "doc": "Elided." }
    ]
  }
]

Upvotes: 3

Views: 5355

Answers (2)

Michael S. Daines

Reputation: 46

The stack trace coming out of Avro is misleading. The problem is likely one level deeper than the class the exception message indicates.

When it says "null of array of com.mycode.DeeplyNestedObject of array of com.mycode.NestedObject of union of com.mycode.ParentObject", it means that one of the fields inside the DeeplyNestedObject is expected to be an array but was found to be null. (It's easy to misread that as meaning the DeeplyNestedObject itself is null inside the NestedObject.)

You'll need to inspect the fields of DeeplyNestedObject and figure out which array is not being serialized correctly. The problem most likely lies where the DeeplyNestedObject is created: it has an array-typed field that isn't being populated in every case before the send method is called.
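
For example (the field name "items" below is hypothetical, since the actual DeeplyNestedObject isn't shown), making sure every array field is set to an empty list rather than left null avoids the NPE, and the generated builder will catch the mistake at construction time:

// Hypothetical sketch: suppose DeeplyNestedObject has an array-typed field "items".
// A field that is never populated stays null and triggers the NPE during write.
DeeplyNestedObject deep = new DeeplyNestedObject();
deep.setItems(new ArrayList<>());  // empty list instead of null

// Alternatively, the generated builder refuses to build if a field
// without a default value is left unset:
DeeplyNestedObject deep2 = DeeplyNestedObject.newBuilder()
        .setItems(Collections.emptyList())
        .build();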

Upvotes: 3

DontPanic

Reputation: 1367

I don't know enough about the objects you have, but from your example it looks like your Avro schema is incorrect.

DeeplyNestedObject is a record in Avro, so your schema should look like this:

{
  "type": "record",
  "name": "NestedObject",
  "namespace": "com.mycode.avro",
  "fields": [
    {
      "name": "timestamp",
      "type": "long"
    },
    {
      "name": "objs",
      "type": {
        "type": "array",
        "items": {
          "type": "record",
          "name": "DeeplyNestedObject",
          "fields": []
        }
      }
    }
  ]
}

Of course, all the fields of DeeplyNestedObject need to be declared inside the "fields": [] array of the DeeplyNestedObject record.
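
Either way, a quick way to verify that the schema is valid and every referenced type resolves is to run it through Avro's Schema.Parser (just a sketch; the file name NestedObject.avsc is assumed):

import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.avro.Schema;

public class SchemaCheck {
    public static void main(String[] args) throws Exception {
        // Assumes the schema above is saved as NestedObject.avsc.
        String json = new String(Files.readAllBytes(Paths.get("NestedObject.avsc")));
        // Throws SchemaParseException if a referenced type (e.g. DeeplyNestedObject)
        // is not defined or not visible in the namespace.
        Schema schema = new Schema.Parser().parse(json);
        System.out.println(schema.toString(true));  // pretty-print the resolved schema
    }
}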

Upvotes: 0
