Reputation: 1111
I'm trying to get my head around extracting attributes from Avro and JSON. I'm able to extract attributes from JSON by using the EvaluateJsonPath processor. I'm trying to do the same with Avro, but I'm not sure whether it is achievable.
Here is my flow: ExecuteSQL -> SplitAvro -> UpdateAttribute.
UpdateAttribute is the processor where I want to extract the attributes. Below is a snapshot of the UpdateAttribute processor:
So, my basic question is: can attributes be extracted from Avro? If yes, please point me to the right approach. Or is it always necessary to use ConvertAvroToJSON before extracting the attributes?
Upvotes: 7
Views: 8663
Reputation: 1944
Maybe use PartitionRecord instead of the SplitAvro and UpdateAttribute processors. It will partition your records based on the attributes you provide, so there is no need for explicit splitting, or you can split later in the flow.
E.g., for the setup in OP's question (ExecuteSQL -> SplitAvro -> UpdateAttribute):
And you can configure PartitionRecord like below, with a corresponding RecordPath for each attribute:
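To make the partitioning idea concrete, here is a minimal Python sketch of what such a configuration does conceptually. It is not NiFi code and not an actual PartitionRecord configuration: the sample records, the attribute names `country` and `tier`, and the simplified `/field/subfield` path syntax are all assumptions made for illustration. Records whose paths evaluate to the same values end up in the same group, and each group carries those values as attributes.

```python
from collections import defaultdict


def eval_path(record, path):
    """Resolve a simple '/a/b' style path against nested dicts (None if missing)."""
    value = record
    for key in path.strip("/").split("/"):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def partition(records, paths):
    """Group records by the tuple of values that their paths evaluate to.

    `paths` maps a (hypothetical) attribute name to a path expression.
    Returns a list of (attributes, records) pairs, one per distinct group.
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(eval_path(record, p) for p in paths.values())
        groups[key].append(record)
    return [(dict(zip(paths.keys(), key)), recs) for key, recs in groups.items()]


# Hypothetical rows, standing in for records produced by ExecuteSQL.
records = [
    {"id": 1, "country": "DE", "customer": {"tier": "gold"}},
    {"id": 2, "country": "DE", "customer": {"tier": "gold"}},
    {"id": 3, "country": "US", "customer": {"tier": "basic"}},
]
# Hypothetical attribute name -> path, one entry per attribute to extract.
paths = {"country": "/country", "tier": "/customer/tier"}

result = partition(records, paths)
for attributes, group in result:
    print(attributes, [r["id"] for r in group])
```

In this sketch the first two records share both values and therefore land in one group, which is why no separate splitting step is needed to get per-group attributes.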
Upvotes: 2
Reputation: 12083
Currently, there is no way in NiFi to extract attributes directly from Avro (there is not yet an AvroPath like XPath for XML or JsonPath for JSON), so, as you said, you can use ConvertAvroToJSON before extracting the attributes.
Alternatively, I wrote a Groovy script for use in an ExecuteScript processor. It takes "Avro path" values as dynamic properties (each starting with avro.path, and whose value is really a JsonPath), does the conversion of Avro to JSON in memory, and requires you to download and point to the Avro JARs. I can post it here if you are interested, but really its only advantage is that it keeps the flow file content in Avro. Although it might be annoying, you could use ConvertAvroToJSON -> EvaluateJsonPath -> ConvertJSONToAvro as the workaround.
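As a rough illustration of the shape of that idea (this is not the Groovy script itself, and it is written in Python rather than Groovy), the sketch below picks out the properties whose names start with avro.path, evaluates each value as a very simplified dotted path, and returns the results as string attributes without touching the record. Several things here are assumptions: the record is taken to be already decoded into a dict (the real script would need the Avro JARs for that step), only plain `$.a.b` paths are supported, and the property name is reused as the attribute name.

```python
PREFIX = "avro.path"  # dynamic properties with this prefix hold path expressions


def eval_json_path(record, path):
    """Resolve a plain dotted path such as '$.a.b' against nested dicts."""
    value = record
    for key in path.lstrip("$").strip(".").split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def extract_attributes(record, properties):
    """Return {attribute name: string value} for every avro.path* property.

    The record itself is only read, never modified, mirroring the point that
    the flow file content can stay as it was.
    """
    attributes = {}
    for name, path in properties.items():
        if not name.startswith(PREFIX):
            continue  # ignore unrelated processor properties
        value = eval_json_path(record, path)
        if value is not None:
            attributes[name] = str(value)  # flow file attributes are strings
    return attributes


# Hypothetical decoded record and processor properties.
record = {"id": 42, "customer": {"name": "Ada"}}
properties = {
    "avro.path.id": "$.id",
    "avro.path.customer": "$.customer.name",
    "Script Engine": "Groovy",
}

attributes = extract_attributes(record, properties)
print(attributes)
```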
Upvotes: 14