Reputation: 6660
I am trying to convert a protobuf object to Avro. I am using the following code:
// myProto is an object deserialized using the Google protobuf API
ProtobufDatumWriter<MyProto> pbWriter = new ProtobufDatumWriter<MyProto>(MyProto.class);
FileOutputStream fo = new FileOutputStream(args[0]);
Encoder e = EncoderFactory.get().binaryEncoder(fo, null);
pbWriter.write(myProto, e);
fo.flush();
The Avro file was created successfully, and if I cat the file I can see the data in it. However, when I use avro-tools to get the schema or metadata of the saved file, it fails with:
Exception in thread "main" java.io.IOException: Not a data file.
at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:105)
at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
at org.apache.avro.tool.DataFileGetSchemaTool.run(DataFileGetSchemaTool.java:47)
Looking at the Avro source code, this error means the first 4 bytes of the file do not match the 4-byte MAGIC header that an Avro data file must start with. I am trying to work out whether I have done something wrong.
Appreciate any help you can give me.
Upvotes: 5
Views: 9733
Reputation: 6660
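I figured out why my code was not working. ProtobufDatumWriter on its own only writes the binary-encoded record, without the file header that avro-tools expects. Instead of using ProtobufDatumWriter to write to the file directly, wrap it in a DataFileWriter, which writes the Avro container file format.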
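ProtobufDatumWriter<MyProto> pbWriter = new ProtobufDatumWriter<MyProto>(MyProto.class);
// DataFileWriter adds the container header (magic bytes, schema, sync marker)
DataFileWriter<MyProto> dataFileWriter = new DataFileWriter<MyProto>(pbWriter);
// Derive the Avro schema from the generated protobuf class
Schema schema = ProtobufData.get().getSchema(MyProto.class);
dataFileWriter.create(schema, new File("test.avro"));
dataFileWriter.append(myProto);
// close() flushes the last block and closes the underlying file
dataFileWriter.close();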
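To confirm the difference without avro-tools, the magic-byte check mentioned in the question can be reproduced with the standard library alone. The following is a minimal sketch, not Avro's own implementation: the class name AvroMagicCheck and its helper methods are hypothetical, and it assumes only that an Avro container file begins with the four bytes 'O', 'b', 'j', 1, as the Avro specification describes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of Avro: checks whether a file starts with
// the Avro container-file magic bytes.
class AvroMagicCheck {
    // Avro object container files start with 'O', 'b', 'j', followed by the byte 1.
    static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // Returns true if the given bytes begin with the Avro magic header.
    static boolean hasAvroMagic(byte[] header) {
        return header != null
            && header.length >= MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, MAGIC.length), MAGIC);
    }

    // Reads up to the first 4 bytes of the file and compares them with MAGIC.
    static boolean isAvroContainerFile(Path path) throws IOException {
        byte[] header = new byte[MAGIC.length];
        int off = 0;
        try (InputStream in = Files.newInputStream(path)) {
            while (off < header.length) {
                int n = in.read(header, off, header.length - off);
                if (n < 0) {
                    break; // file is shorter than the magic header
                }
                off += n;
            }
        }
        return off == MAGIC.length && hasAvroMagic(header);
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name used in the answer above.
        Path path = Paths.get(args.length > 0 ? args[0] : "test.avro");
        System.out.println(path + (isAvroContainerFile(path)
            ? " starts with the Avro magic bytes"
            : " does not start with the Avro magic bytes"));
    }
}
```

Run against the file written by ProtobufDatumWriter alone, this check should fail, while the file written through DataFileWriter should pass. It only inspects the first four bytes, so it does not validate the schema or the data blocks.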
Upvotes: 6