Reputation: 243
We are using avro1.8.2 to serialize data with optional date type field to be published to topic.
record aRecord {
/** Variable: lastUpdate
* lastUpdate indicates the latest date and time the reference asset was updated
*/
union {null, date} lastUpdate = null;
/** Variable: businessDate
* businessDate indicates the business date of the reference asset price
*/
union {null, date} businessDate = null;
}
Ran into the following exception while using the avro tool generated java class to serialize the data:
Error serializing avro message
Caused by: org.apache.avro.AvroRunTimeException: Unknown datum type org.joda.time.LocalDate: 2021-09-17
at org.apache.avro.generic.GenericData.getSchemaName(GenericData.java:772)
at org.apache.avro.specific.SpecificData.getSchemaName(SpecificData.java:302)
at org.apache.avro.generic.GenericData.resolveUnion
Please note that2 this happens regardless of the value is null or non-null (as shown value 2021-09-17 also caused the exception) We did the following investigation and experiment but could not figure it out why:
Making the date field mandatory, the issue is resolved. This is because DATE_CONVERSION is added to the corresponding field in the java class generated by avro tool. If this field is defined as optional and default value is null, DATA_CONVERSION is not added to the java file generated by avro tool.
Using avro 1.9.1 resolved the issue unfortunately we must use avro 1.8.2
We also tried a few other versions of kafka-avro-serializer and spring-boot kafka framework. Nothing works for us.
Other projects that depend on avro1.8.2 seems to be able to handle this and we checked all the places as far as we considered relevant and all the codes are the same except that somehow they have DATE_CONVERSION in place in the java file generated by avro tool (although they are defined in advl file exactly the same).
Debuggin into the GenericData.java we found that if DATE_CONVERSION is in place for optional date field, getSchemaName is not called at all. The getSchemaName basically checks of the type of the object, whether it's an Int, Record, String,...etc.
The date is a logicaltype of joda. Its real type is int as far as we understand
So our questions are:
How to make avro tool enable DATE_CONVERSION for optional "date" type field using avro 1.8.2?
If DATE_CONVERSION is not the key to resolve the issue, what's the best practice to serialize date type field using avro 1.8.2? and this field could be null (default) or non-null.
Thanks.
Upvotes: 0
Views: 881
Reputation: 243
SpecificData specificData = SpecificData.get();
specificData.addLogicalTypeConversion(new DateConversion());
DatumWriter<MessageClass> dw = new SpecificDatumWriter<MessageClass>(message.getSchema(), specificData);
DataFileWriter<MessageClass> dfw = new DataFileWriter<MessageClass>(dw);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
dfw.create(message.getSchema(), outputStream);
dfw.append(message);
dfw.close();
ProducerRecord<String, byte[]> record = new ProducerRecord<>(topic, key, outputStream.toByteArray());
return kafkaProducer.send(record, new Callback());
The above code fixed the issue. MessageClass is the java code generated by avro tool.
message is wrapped in specificData which is constructed with new DateConversion() DATE_CONVERSION is exactly what is needed for optional date field during serialization.
Note that this solution is only needed as a workaround to avro1.8.
Upvotes: 0