Rostyslav Barmakov

Reputation: 186

Deserialize Avro to Map

Does anybody know a way to deserialize Avro without using any POJOs or schemas?

The problem: I have a data stream of different Avro files. The goal is to group that data by the presence of certain attributes (e.g. user.role, another.really.deep.attribute.with.specific.value, and so on). Each Avro entry might contain any number of matching attributes, from zero to all of those listed.

So there is no need to do anything with the data, just to peek at some elements.
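To make the goal concrete, here is a minimal sketch of the "peek" step, assuming each entry has already been turned into a nested Map somehow. The class AttributePeek and its methods are hypothetical helpers written for illustration; they are not part of Avro, and a null value is treated the same as a missing attribute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not an Avro API): looks up dotted paths in a nested Map.
final class AttributePeek {

    // Returns the value at a dotted path such as "user.role",
    // or null if any step of the path is missing (or holds null).
    static Object peek(Map<String, Object> record, String dottedPath) {
        Object current = record;
        for (String key : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null; // reached a leaf before the path ended
            }
            current = ((Map<?, ?>) current).get(key);
            if (current == null) {
                return null; // key absent at this level
            }
        }
        return current;
    }

    // Lists which of the wanted paths are present in the record.
    static List<String> matchingPaths(Map<String, Object> record, List<String> wanted) {
        List<String> found = new ArrayList<>();
        for (String path : wanted) {
            if (peek(record, path) != null) {
                found.add(path);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("role", "admin");
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("user", user);
        // prints [user.role]
        System.out.println(matchingPaths(record, Arrays.asList("user.role", "user.email")));
    }
}
```

The list returned by matchingPaths could then serve as the grouping key for an entry.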

The question is: is there any way to convert that data to a Map or Node, as I can with JSON using Jackson or GSON?

I've tried to use GenericDatumReader, but it requires a Schema. So maybe all I need is to read the schema from the Avro data (how?).

Also, I've tried to use something like this, but this approach doesn't work.

public Map deserialize(byte[] data) {
    DatumReader<LinkedHashMap> reader
     = new SpecificDatumReader<>(LinkedHashMap.class);
    try {
        Decoder decoder = DecoderFactory.get().binaryDecoder(data, null);
        return reader.read(null, decoder);
    } catch (IOException e) {
        logger.error("Deserialization error:" + e.getMessage());
        return null; // nothing could be decoded
    }
}

Since I have time to 'play' with the problem, I have created a utility class that generates schemas depending on keys. It works, but it looks like a lot of overhead.

Upvotes: 0

Views: 641

Answers (1)

OneCricketeer

Reputation: 191963

A reader schema is required to deserialize any message.

If you have the writer schema available, you can simply use that. Note that Avro files include the schema they were written with, and you can extract it with java -jar avro-tools.jar getschema <file>.

Without either of these options, you'll need to work out the schema on your own (perhaps using a hexdump and knowledge of how Avro data is encoded).
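For the hexdump route, it helps to know that the Avro binary encoding writes int and long values as variable-length zig-zag integers, and a string as a long byte count followed by UTF-8 bytes. The sketch below decodes those two primitives by hand; AvroBinaryPeek is a hypothetical class written for illustration, not an Avro API, and it does nothing to discover field order or types, which still have to be inferred.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not an Avro API): hand-decodes two Avro binary primitives.
final class AvroBinaryPeek {

    // Reads one variable-length zig-zag encoded long from the buffer.
    static long readLong(ByteBuffer buf) {
        long raw = 0;
        int shift = 0;
        while (true) {
            int b = buf.get() & 0xFF;
            raw |= (long) (b & 0x7F) << shift; // low 7 bits carry data
            if ((b & 0x80) == 0) {
                break;                         // high bit clear: last byte
            }
            shift += 7;
            if (shift > 63) {
                throw new IllegalStateException("varint longer than 10 bytes");
            }
        }
        return (raw >>> 1) ^ -(raw & 1);       // undo zig-zag
    }

    // Reads a string: a long byte count, then that many UTF-8 bytes.
    static String readString(ByteBuffer buf) {
        int length = (int) readLong(buf);
        byte[] bytes = new byte[length];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0x80 0x01 is the long 64; 0x06 'f' 'o' 'o' is the string "foo".
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {(byte) 0x80, 0x01, 0x06, 0x66, 0x6f, 0x6f});
        System.out.println(readLong(buf));   // prints 64
        System.out.println(readString(buf)); // prints foo
    }
}
```

Spotting a small length prefix followed by readable text in a hexdump is often the easiest way to locate string fields and work outward from there.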

Upvotes: 1
