Anuja Barve

Reputation: 320

Converting Avro Binary String to Json

I have a String in Avro binary format and want to convert it to JSON. Can someone please guide me? I tried the solutions available online, but they did not work.

public String avroToJson(byte[] avro) throws IOException {
        boolean pretty = false;
        GenericDatumReader<GenericRecord> reader = null;
        JsonEncoder encoder = null;
        ByteArrayOutputStream output = null;
        try {
            reader = new GenericDatumReader<GenericRecord>();
            InputStream input = new ByteArrayInputStream(avro);
            DataFileStream<GenericRecord> streamReader = new DataFileStream<GenericRecord>(input, reader);
            output = new ByteArrayOutputStream();
            Schema schema = streamReader.getSchema();
            DatumWriter<GenericRecord> writer = new GenericDatumWriter<GenericRecord>(schema);
            encoder = EncoderFactory.get().jsonEncoder(schema, output, pretty);
            for (GenericRecord datum : streamReader) {
                writer.write(datum, encoder);
            }
            encoder.flush();
            output.flush();
            return new String(output.toByteArray());
        } finally {
            try {
                if (output != null) output.close();
            } catch (Exception e) {
            }
        }
    }

I convert my String to a byte array using getBytes() and pass it to this function, but I get this exception: Exception in thread "main" org.apache.avro.InvalidAvroMagicException: Not an Avro data file.

Upvotes: 2

Views: 19746

Answers (1)

Ryan Skraba

Reputation: 1158

Avro specifies a binary format for serializing a single object, but also an Object Container File (also known as a data file), which can hold many objects in a way that is convenient for file access.

DataFileStream expects the container file, but from your description, it looks like you have a single serialized instance.
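If you are unsure which of the two you have, one way to check is to look at the first four bytes: per the Avro specification, a container file starts with the ASCII characters `Obj` followed by the byte 1, and that header is what DataFileStream fails to find when it throws InvalidAvroMagicException. A minimal sketch of such a check, where the class and method names are hypothetical and not part of the Avro API:

```java
public class AvroFormatCheck {
    // Avro Object Container Files begin with these four bytes: 'O', 'b', 'j', 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // Hypothetical helper: true if the data starts with the container file magic.
    public static boolean looksLikeContainerFile(byte[] data) {
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

If this returns false, the bytes are not a container file, and DataFileStream will reject them.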

You probably want something like:

  public String avroToJson(Schema schema, byte[] avroBinary) throws IOException {
    // byte to datum
    DatumReader<Object> datumReader = new GenericDatumReader<>(schema);
    Decoder decoder = DecoderFactory.get().binaryDecoder(avroBinary, null);
    Object avroDatum = datumReader.read(null, decoder);

    // datum to json
    try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
      DatumWriter<Object> writer = new GenericDatumWriter<>(schema);
      JsonEncoder encoder = EncoderFactory.get().jsonEncoder(schema, baos, false);
      writer.write(avroDatum, encoder);
      encoder.flush();
      baos.flush();
      return new String(baos.toByteArray(), StandardCharsets.UTF_8);
    }
  }

Note that this means you must know the schema in advance to deserialize the binary data. If it were an Avro data file, you could get the schema from the file metadata.
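One more thing that may be worth checking, since you mention going through a String and getBytes(): Avro binary is arbitrary bytes, and decoding arbitrary bytes as text and encoding them back is not guaranteed to return the same bytes. I am assuming here that your data passed through a text decoding step at some point; if so, something like Base64 is a safer way to carry it in a String. A small sketch with hypothetical class and method names, using only the JDK:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class BinaryStringRoundTrip {
    // Decodes raw bytes as UTF-8 text and encodes them back.
    // Invalid sequences are replaced during decoding, so this can be lossy.
    public static byte[] viaUtf8String(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Carries the bytes in a String as Base64, which round-trips exactly.
    public static byte[] viaBase64(byte[] raw) {
        String text = Base64.getEncoder().encodeToString(raw);
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // 0xC3 followed by 0x28 is not a valid UTF-8 sequence.
        byte[] raw = {(byte) 0xC3, (byte) 0x28};
        System.out.println(Arrays.equals(raw, viaUtf8String(raw))); // prints false
        System.out.println(Arrays.equals(raw, viaBase64(raw)));     // prints true
    }
}
```

If the bytes were damaged this way, no Avro reader can recover them, so it is best to keep the data as a byte[] from the point where it is produced.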

Upvotes: 6
