DJ180

Reputation: 19854

Avro: ReflectDatumWriter does not output schema information

See the following sample code:

        User datum = new User("a123456", "[email protected]");
        Schema schema = ReflectData.get().getSchema(datum.getClass());
        DatumWriter<Object> writer = new ReflectDatumWriter<>(schema);
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        Encoder encoder = EncoderFactory.get().binaryEncoder(output, null);
        writer.write(datum, encoder);
        encoder.flush();
        byte[] bytes = output.toByteArray();
        System.out.println(new String(bytes));

which produces:

[email protected]

I had presumed that all Avro writers would write the schema information as well as the data, but this one does not.
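For what it's worth, this output is consistent with Avro's raw binary encoding, which carries only field values: a record is its fields written in schema order, and a string is a zigzag varint length followed by UTF-8 bytes. The following is a minimal stdlib-only sketch of that layout, not Avro's own code. The class name `RawAvroSketch`, the two-string field order, and the address `user@example.com` are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class RawAvroSketch {

    // Avro writes int/long values as zigzag-encoded variable-length integers.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);             // zigzag: small magnitudes stay small
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));  // 7 payload bits, continuation bit set
            z >>>= 7;
        }
        out.write((int) z);                        // last byte, continuation bit clear
    }

    // A string is its UTF-8 byte length followed by the bytes themselves.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // A record is just its field values in schema order: no names, no types.
    // Assumed here: a User with two string fields (hypothetical order).
    static byte[] encodeUser(String id, String email) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, id);
        writeString(out, email);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeUser("a123456", "user@example.com");
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x ", b));
        }
        // A hex dump is easier to read than new String(bytes) for binary data.
        System.out.println(hex.toString().trim());
    }
}
```

Nothing in these bytes names the fields or their types, which is why a reader needs the schema from somewhere else to decode them.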

I can get the schema written if I use a GenericDatumWriter in combination with a DataFileWriter, but I want to use the ReflectDatumWriter because I don't want to construct a GenericRecord myself (I want the library to do this).

How do I get the schema serialized as well?

Upvotes: 2

Views: 551

Answers (1)

DJ180

Reputation: 19854

I solved this myself. You need to use a DataFileWriter, because its create() method writes the schema to the output.

The solution is to use a DataFileWriter in conjunction with a ByteArrayOutputStream:

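        // Derive the schema from the class via reflection.
        Schema schema = ReflectData.get().getSchema(User.class);
        // DataFileWriter wraps any DatumWriter; a GenericDatumWriter is used here.
        DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<>(schema);
        DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<>(datumWriter);
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        // create() writes the container header, which includes the schema.
        dataFileWriter.create(schema, output);
        // createGenericRecord is a helper not shown here; it builds the record from the schema.
        GenericRecord user = createGenericRecord(schema);
        dataFileWriter.append(user);
        dataFileWriter.close();
        byte[] bytes = output.toByteArray();
        System.out.println(new String(bytes));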
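Printing binary output with new String(bytes) is hard to read, so a quick programmatic check can help confirm that the bytes carry a container header. As far as I know, the Avro object container format starts with the four bytes 'O', 'b', 'j', 1 and stores the schema JSON in the header metadata under the key avro.schema. The sketch below relies on those two facts only. The class name `AvroContainerCheck` and its method names are hypothetical, and the key search is a rough heuristic, not a real header parser.

```java
import java.nio.charset.StandardCharsets;

public class AvroContainerCheck {

    // Container files begin with the ASCII bytes "Obj" followed by the byte 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // True if the buffer starts with the container-file magic bytes.
    static boolean hasContainerHeader(byte[] bytes) {
        if (bytes == null || bytes.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (bytes[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Heuristic: look for the metadata key anywhere in the buffer.
    // ISO-8859-1 maps every byte to exactly one char, so no bytes are lost.
    static boolean mentionsSchemaKey(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.ISO_8859_1);
        return text.contains("avro.schema");
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the start of a container file (not real writer output).
        byte[] sample = "Obj\u0001..avro.schema..".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(hasContainerHeader(sample) && mentionsSchemaKey(sample));
    }
}
```

Passing the byte array from the snippet above to hasContainerHeader should distinguish it from the raw binaryEncoder output in the question, which has no such header.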

Upvotes: 1
