Reputation: 1101
I am fairly new to Avro and going through documentation for nested types. I have the example below working nicely but many different types within the model will have addresses. Is it possible to define an address.avsc file and reference that as a nested type? If that is possible, can you also take it a step further and have a list of Addresses for a Customer? Thanks in advance.
{"namespace": "com.company.model",
"type": "record",
"name": "Customer",
"fields": [
{"name": "firstname", "type": "string"},
{"name": "lastname", "type": "string"},
{"name": "email", "type": "string"},
{"name": "phone", "type": "string"},
{"name": "address", "type":
{"type": "record",
"name": "AddressRecord",
"fields": [
{"name": "streetaddress", "type": "string"},
{"name": "city", "type": "string"},
{"name": "state", "type": "string"},
{"name": "zip", "type": "string"}
]}
}
]
}
Upvotes: 16
Views: 29148
Reputation: 980
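To add to @Princey James's answer:
With "Example for 2. Declare all your types in a single avsc file", serializing and deserializing works with code generation.
Without code generation it fails if you pass the parsed schema straight to GenericData.Record. A top-level JSON array parses to a union schema, not a record schema, so
you get org.apache.avro.AvroRuntimeException: Not a record schema: [{"type":" ...
To avoid this, pick the individual record schema out of the union (Schema.getTypes()) before building records.
Working example with code generation: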
@Test
public void avroWithCode() throws IOException {
UserPerso UserPerso3 = UserPerso.newBuilder()
.setName("Charlie")
.setFavoriteColor("blue")
.setFavoriteNumber(null)
.build();
AddressRecord adress = AddressRecord.newBuilder()
.setStreetaddress("mo")
.setCity("Paris")
.setState("IDF")
.setZip("75")
.build();
ArrayList<AddressRecord> li = new ArrayList<>();
li.add(adress);
Customer cust = Customer.newBuilder()
.setUser(UserPerso3)
.setPhone("0101010101")
.setAddress(li)
.build();
String fileName = "cust.avro";
File a = new File(fileName);
DatumWriter<Customer> customerDatumWriter = new SpecificDatumWriter<>(Customer.class);
DataFileWriter<Customer> dataFileWriter = new DataFileWriter<>(customerDatumWriter);
dataFileWriter.create(cust.getSchema(), new File(fileName));
dataFileWriter.append(cust);
dataFileWriter.close();
DatumReader<Customer> custDatumReader = new SpecificDatumReader<>(Customer.class);
DataFileReader<Customer> dataFileReader = new DataFileReader<>(a, custDatumReader);
Customer cust2 = null;
while (dataFileReader.hasNext()) {
cust2 = dataFileReader.next(cust2);
System.out.println(cust2);
}
}
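Without code generation (this is the version that throws the exception above):
@Test
public void avroWithoutCode() throws IOException {
// Each parse() call returns the whole union declared in user.avsc, not a single record schema,
// so new GenericData.Record(...) below rejects it with "Not a record schema".
Schema schemaUserPerso = new Schema.Parser().parse(new File("src/main/resources/avroTest/user.avsc"));
Schema schemaAdress = new Schema.Parser().parse(new File("src/main/resources/avroTest/user.avsc"));
Schema schemaCustomer = new Schema.Parser().parse(new File("src/main/resources/avroTest/user.avsc"));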
System.out.println(schemaUserPerso);
GenericRecord UserPerso3 = new GenericData.Record(schemaUserPerso);
UserPerso3.put("name", "Charlie");
UserPerso3.put("favorite_color", "blue");
UserPerso3.put("favorite_number", null);
GenericRecord adress = new GenericData.Record(schemaAdress);
adress.put("streetaddress", "mo");
adress.put("city", "Paris");
adress.put("state", "IDF");
adress.put("zip", "75");
ArrayList<GenericRecord> li = new ArrayList<>();
li.add(adress);
GenericRecord cust = new GenericData.Record(schemaCustomer);
cust.put("user", UserPerso3);
cust.put("phone", "0101010101");
cust.put("address", li);
String fileName = "cust.avro";
File file = new File(fileName);
DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<>(schemaCustomer);
DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<>(datumWriter);
dataFileWriter.create(schemaCustomer, file);
dataFileWriter.append(cust);
dataFileWriter.close();
File a = new File(fileName);
DatumReader<GenericRecord> datumReader = new GenericDatumReader<>(schemaCustomer);
DataFileReader<GenericRecord> dataFileReader = new DataFileReader<>(a, datumReader);
GenericRecord cust2 = null;
while (dataFileReader.hasNext()) {
cust2 = dataFileReader.next(cust2);
System.out.println(cust2);
}
}
Upvotes: 2
Reputation: 61
Just to add to @Princey James's answer: the nested type must be defined before it is used.
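As a minimal sketch of that ordering rule, the schema below defines AddressRecord inline the first time it appears and then refers to it by name in a later field. The field names billing and shipping are hypothetical and only there for illustration; the record and namespace names come from the question.

```json
{
  "namespace": "com.company.model",
  "type": "record",
  "name": "Customer",
  "fields": [
    {
      "name": "billing",
      "doc": "Hypothetical field: AddressRecord is defined here, at its first use",
      "type": {
        "type": "record",
        "name": "AddressRecord",
        "fields": [
          {"name": "streetaddress", "type": "string"},
          {"name": "city", "type": "string"},
          {"name": "state", "type": "string"},
          {"name": "zip", "type": "string"}
        ]
      }
    },
    {
      "name": "shipping",
      "doc": "Hypothetical field: refers to the type defined above by name",
      "type": "AddressRecord"
    }
  ]
}
```

Swapping the two fields, so that the by-name reference comes before the definition, should make the schema fail to parse because AddressRecord is not yet known at that point.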
Upvotes: 3
Reputation: 738
There are 4 possible ways:
Example for 2. Declare all your types in a single avsc file. This also shows how to declare an array of addresses.
[
{
"type": "record",
"namespace": "com.company.model",
"name": "AddressRecord",
"fields": [
{
"name": "streetaddress",
"type": "string"
},
{
"name": "city",
"type": "string"
},
{
"name": "state",
"type": "string"
},
{
"name": "zip",
"type": "string"
}
]
},
{
"namespace": "com.company.model",
"type": "record",
"name": "Customer",
"fields": [
{
"name": "firstname",
"type": "string"
},
{
"name": "lastname",
"type": "string"
},
{
"name": "email",
"type": "string"
},
{
"name": "phone",
"type": "string"
},
{
"name": "address",
"type": {
"type": "array",
"items": "com.company.model.AddressRecord"
}
}
]
},
{
"namespace": "com.company.model",
"type": "record",
"name": "Customer2",
"fields": [
{
"name": "x",
"type": "string"
},
{
"name": "y",
"type": "string"
},
{
"name": "address",
"type": {
"type": "array",
"items": "com.company.model.AddressRecord"
}
}
]
}
]
Example for 3. Using a single static parser
Parser parser = new Parser(); // Make this static and reuse
parser.parse(<location of address.avsc file>);
parser.parse(<location of customer.avsc file>);
parser.parse(<location of customer2.avsc file>);
To get hold of the Schema objects, for example to create new records, we can either call the parser's getTypes() method (https://avro.apache.org/docs/1.5.4/api/java/org/apache/avro/Schema.Parser.html#getTypes()) or keep the Schema returned by each parse call:
Parser parser = new Parser(); // Make this static and reuse
// Parse address.avsc first so the later files can refer to AddressRecord by name
Schema addressSchema = parser.parse(<location of address.avsc file>);
Schema customerSchema = parser.parse(<location of customer.avsc file>);
Schema customer2Schema = parser.parse(<location of customer2.avsc file>);
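For this approach the two files might look roughly like the sketch below. The file contents are an assumption based on the schema in the question, not something taken from a real project. address.avsc holds only the shared record:

```json
{
  "namespace": "com.company.model",
  "type": "record",
  "name": "AddressRecord",
  "fields": [
    {"name": "streetaddress", "type": "string"},
    {"name": "city", "type": "string"},
    {"name": "state", "type": "string"},
    {"name": "zip", "type": "string"}
  ]
}
```

customer.avsc then refers to it by its full name instead of redefining it:

```json
{
  "namespace": "com.company.model",
  "type": "record",
  "name": "Customer",
  "fields": [
    {"name": "firstname", "type": "string"},
    {"name": "lastname", "type": "string"},
    {"name": "email", "type": "string"},
    {"name": "phone", "type": "string"},
    {
      "name": "address",
      "type": {
        "type": "array",
        "items": "com.company.model.AddressRecord"
      }
    }
  ]
}
```

On its own, customer.avsc is not parseable because AddressRecord is undefined there. It only resolves when the same Parser instance has already parsed address.avsc, which is why the parser is shared and the address file goes first.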
Upvotes: 29