NightWolf

Reputation: 7784

Avro record with Any type field in Scala

Say I have a simple key/value pair in Avro where the value could be a float, double, int, string, etc.:

{"namespace": "com.namespace.kafka.event",
 "type": "record",
 "name": "RecordName",
 "fields": [
    {"name": "key", "type": "string"},
    {"name": "value", "type": "Any/Object/Bytes???"}
 ]
}

What is the best way to represent this in Avro?

  1. Have an array of bytes that is somehow deserialised in Scala and infer the type or add another value field with metadata
  2. Create a custom record type for each primitive type that goes in value and use the generic record parsing in Avro
  3. Create a key/value pair for each primitive value type we wish to represent.

The other problem is how we would represent this in Scala. Having an Any type is a pain; it's much nicer to know the type (whether it's numeric, etc.) rather than having to do type tests everywhere.

Upvotes: 3

Views: 3819

Answers (2)

sksamuel

Reputation: 16387

If you are using avro4s, you can use an Either[A, B] when you only have two types. Define your case class to include the Either, like this:

case class Moo(either: Either[String, BigDecimal])

Then you can create a schema for it:

val schema = SchemaFor[Moo]

Or write out data:

val moo1 = Moo(Left("moo1"))
val moo2 = Moo(Right(12.3))

val output = new ByteArrayOutputStream
val avro = AvroOutputStream.data[Moo](output)
avro.write(Seq(moo1, moo2))
avro.close()

And read in data:

val bytes = output.toByteArray // the bytes written above
val in = AvroInputStream.data[Moo](bytes)
val moos = in.iterator.toList
in.close()

If you have more than two types you could use Coproduct from Shapeless. The case class now looks like this:

case class Moo(coproduct: String :+: BigDecimal :+: CNil)

If you are not familiar with the coproduct syntax from Shapeless, it looks a bit unusual when you first see it, but you're just combining types using infix style, and :+: is actually the name of a type, just as :: is the name of the non-empty List in standard Scala.

Now you create instances like this:

val moo1 = Moo(Coproduct[String :+: BigDecimal :+: CNil]("moo1"))
val moo2 = Moo(Coproduct[String :+: BigDecimal :+: CNil](BigDecimal(12.3)))

And the rest is the same.
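
On the question's worry about type tests: a value held in a coproduct can be taken apart without casts. Below is a minimal sketch, assuming Shapeless 2.x is on the classpath. The names StringOrDecimal, CoproductSketch and describe are hypothetical and only for illustration; they are not part of avro4s or Shapeless.

```scala
import shapeless.{:+:, CNil, Coproduct, Inl, Inr}

object CoproductSketch {
  // Hypothetical alias so the coproduct type is written only once.
  type StringOrDecimal = String :+: BigDecimal :+: CNil

  case class Moo(coproduct: StringOrDecimal)

  // Hypothetical helper: handles each member of the coproduct by pattern
  // matching on Inl/Inr, so no isInstanceOf/asInstanceOf is needed.
  def describe(moo: Moo): String = moo.coproduct match {
    case Inl(s)         => s"string: $s"
    case Inr(Inl(d))    => s"decimal: $d"
    case Inr(Inr(cnil)) => cnil.impossible // CNil has no values
  }

  def main(args: Array[String]): Unit = {
    val moo1 = Moo(Coproduct[StringOrDecimal]("moo1"))
    val moo2 = Moo(Coproduct[StringOrDecimal](BigDecimal(12.3)))

    println(describe(moo1))
    println(describe(moo2))

    // select[T] returns Some(value) only when the coproduct holds a T.
    println(moo2.coproduct.select[BigDecimal])
    println(moo2.coproduct.select[String])
  }
}
```

Because the coproduct type lists every allowed member, the match covers all cases, which is the main advantage over a field typed as Any.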

See the unit tests in the avro4s repository for further examples.

Upvotes: 2

Knows Not Much

Reputation: 31526

Can you try using Avro's union types?

https://avro.apache.org/docs/1.8.1/spec.html#Unions
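
As a rough illustration of the union approach, here is a sketch that declares the question's value field as a union of primitives and checks values against it using the Apache Avro Java library from Scala. The particular set of union branches, the object name UnionValueSketch and the helper record are assumptions made for this example, not something given in the question.

```scala
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}

object UnionValueSketch {
  // The question's schema, with "value" declared as a union.
  // The choice of branches here is an assumption; list whichever types you need.
  val schemaJson: String =
    """{"namespace": "com.namespace.kafka.event",
      | "type": "record",
      | "name": "RecordName",
      | "fields": [
      |    {"name": "key", "type": "string"},
      |    {"name": "value", "type": ["int", "long", "float", "double", "string"]}
      | ]
      |}""".stripMargin

  val schema: Schema = new Schema.Parser().parse(schemaJson)

  // Hypothetical helper: builds a generic record for the schema above.
  def record(key: String, value: Any): GenericRecord = {
    val r = new GenericData.Record(schema)
    r.put("key", key)
    // The cast boxes Scala primitives (e.g. Double becomes java.lang.Double).
    r.put("value", value.asInstanceOf[AnyRef])
    r
  }

  def main(args: Array[String]): Unit = {
    val numeric = record("temperature", 21.5) // Double fits the "double" branch
    val text    = record("name", "sensor-1")  // String fits the "string" branch
    val invalid = record("flag", true)        // boolean is not a union branch

    // validate reports whether the datum conforms to the schema.
    println(GenericData.get().validate(schema, numeric))
    println(GenericData.get().validate(schema, text))
    println(GenericData.get().validate(schema, invalid))
  }
}
```

With the generic API the value still comes back as an untyped Object on the reading side, so this fixes the wire format but not the Scala-side typing problem raised in the question.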

Upvotes: 1
