m1nkeh

Reputation: 1397

Avro Serialise to Azure EventHub

I can't find much documentation on this topic, but I'm serialising an object to Avro and then sending it to Azure EventHub. I think the Avro payload needs to contain the schema too, because without it, how will a consumer (e.g. Azure Stream Analytics) know how to deserialise it?

The only example I can locate online uses the Microsoft.Hadoop.Avro.Container namespace. This seems to work fine, and I can read the events via Stream Analytics, but does this code 'automagically' include the schema in the payload? I sure can't see any reference to it here:

        using (var memoryStream = new MemoryStream())
        using (var writer = AvroContainer.CreateWriter<T>(memoryStream, Codec.Null))
        using (var seqWriter = new SequentialWriter<T>(writer, items.Count()))
        {
            foreach (var e in items)
            {
                seqWriter.Write(e);
            }

            return memoryStream.ToArray();
        }

The landscape of Avro in .NET seems a bit confused. Why is there a Microsoft-specific NuGet package? It seems quite old; has it now been superseded by something? Is there any documentation on how to leverage the standard Apache.Avro NuGet package to build a payload that contains the schema?

The Azure Event Hub documentation fleetingly mentions Avro, but any Google search only really turns up Event Hub Capture.

Anyway, in short: is there a better way? I don't think I can send the schemas separately for this.

Upvotes: 2

Views: 2722

Answers (1)

Peter Pan

Reputation: 24148

First, Azure Stream Analytics supports processing events in the Avro data format, as described in the official document Parse JSON and Avro data in Azure Stream Analytics.

Even assuming that Azure Stream Analytics cannot deserialize an Avro event the way you want, you can also write a custom .NET deserializer to make it work, as the official documents below describe.

  1. Tutorial: Custom .NET deserializers for Azure Stream Analytics
  2. Use .NET deserializers for Azure Stream Analytics jobs

Meanwhile, I don't think Microsoft.Hadoop.Avro2 is a suitable library for Avro in your scenario. Besides it, there are other choices:

  1. Apache.Avro. Its API reference page is https://avro.apache.org/docs/current/api/csharp/html/namespaces.html, and you will need to refer to the sample code for Java or Python when writing your C# code.
  2. Microsoft.Avro.Core and its GitHub repo dougmsft/microsoft-avro, which has some test code you can refer to.
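
As a rough illustration of the first option, here is a minimal sketch of building a payload with the Apache.Avro package in the Avro object container format, which stores the writer schema in its header next to the data. The `Reading` schema, its field names, and the `AvroContainerSketch` class are hypothetical and made up for this example. Treat it as a starting point under those assumptions, not a verified recipe for Event Hub or Stream Analytics.

```csharp
using System;
using System.IO;
using Avro;
using Avro.File;
using Avro.Generic;

public static class AvroContainerSketch
{
    // Hypothetical schema, for illustration only.
    private const string SchemaJson = @"{
        ""type"": ""record"",
        ""name"": ""Reading"",
        ""fields"": [
            { ""name"": ""deviceId"", ""type"": ""string"" },
            { ""name"": ""temperature"", ""type"": ""double"" }
        ]
    }";

    public static byte[] Serialize()
    {
        var schema = (RecordSchema)Schema.Parse(SchemaJson);
        var datumWriter = new GenericDatumWriter<GenericRecord>(schema);
        var stream = new MemoryStream();

        // The container header (including the schema JSON) and the data
        // blocks are both written to the stream by the file writer.
        using (var fileWriter = DataFileWriter<GenericRecord>.OpenWriter(datumWriter, stream))
        {
            var record = new GenericRecord(schema);
            record.Add("deviceId", "sensor-1");
            record.Add("temperature", 21.5);
            fileWriter.Append(record);
        } // Disposing the writer flushes the final block.

        // ToArray still works after the stream has been closed.
        return stream.ToArray();
    }

    public static void Main()
    {
        byte[] payload = Serialize();

        // Read the payload back without supplying a schema: the reader
        // recovers it from the container header.
        using (var reader = DataFileReader<GenericRecord>.OpenReader(new MemoryStream(payload)))
        {
            Console.WriteLine(reader.GetSchema());
            foreach (GenericRecord r in reader.NextEntries)
            {
                Console.WriteLine(r["deviceId"]);
            }
        }
    }
}
```

The byte array returned by `Serialize` is what you would put in the event body. Reading it back with no schema argument is a quick way to check that the schema really travels with the payload.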

Upvotes: 4
