Reputation:
I want to write a NiFi processor that reads an XML file from an HDFS directory and extracts its data into flowfile attributes. Also, if two NiFi processors can pick up the same file and read from it or write to it, how can I lock the file so that only one processor uses it at a time? Can you recommend any articles, code examples, or related material that could help? I haven't written a custom processor yet.
Upvotes: 1
Views: 1094
Reputation: 14184
I'm not sure why you need to write a custom processor in this case, because both the GetHDFS and EvaluateXPath processors exist and should be able to perform the necessary tasks here.
Be careful when extracting flowfile content into attributes, as flowfile content is stored in the content repository and only a reference pointer is passed around as the flowfile moves through the flow. Attributes, however, are stored inline in the flowfile repository, and occupy heap space for rapid retrieval. It is easy to ingest a large piece of source data and accidentally put the whole block of data into the heap if you are not careful. See Apache NiFi In Depth for more details.
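To make the attribute-size concern concrete, here is a minimal sketch in plain Java (JDK only, no NiFi API) of pulling a few small values out of an XML document by XPath, roughly what EvaluateXPath does when it writes to attributes. The class name `XmlToAttributes`, the `extract` helper, and the 256-character cap are all hypothetical choices for illustration, not anything NiFi defines; the point is only that each extracted value is checked before it would be kept in memory as an attribute.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of NiFi: maps attribute names to XPath results.
class XmlToAttributes {
    // Assumed cap for illustration; pick a limit that suits your flow.
    static final int MAX_ATTRIBUTE_LENGTH = 256;

    static Map<String, String> extract(String xml, Map<String, String> xpathsByName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Refuse DOCTYPE declarations so untrusted XML cannot pull in external entities.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : xpathsByName.entrySet()) {
            // String evaluation yields "" when the expression matches nothing.
            String value = xpath.evaluate(entry.getValue(), doc);
            if (value.length() > MAX_ATTRIBUTE_LENGTH) {
                // Large values belong in flowfile content, not in attributes.
                throw new IllegalArgumentException(
                        "Value for '" + entry.getKey() + "' is too large for an attribute");
            }
            attributes.put(entry.getKey(), value);
        }
        return attributes;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> xpaths = new LinkedHashMap<>();
        xpaths.put("order.id", "/order/id");
        xpaths.put("order.customer", "/order/customer");
        String xml = "<order><id>42</id><customer>acme</customer></order>";
        System.out.println(extract(xml, xpaths));
    }
}
```

In a real flow you would configure the equivalent XPath expressions as properties on EvaluateXPath and only target small, scalar nodes.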
If you are still interested in performing custom processor development, this article by Bryan Bende is a good starting point. The versions referenced are stale, but the process described should hold up quite well. The Apache NiFi Developer Guide is another. Finally, the Apache NiFi Contributor Guide has checkstyle instructions, tips for configuring your development environment, etc.
Upvotes: 2
Reputation: 5271
There are two questions here:
1 - How do you extract XML into flowfile attributes?
Options:
SplitXml -> EvaluateXPath (destination: flowfile attribute) -> ReplaceText (to use the attributes)
TransformXml -> SplitJson -> EvaluateJsonPath (destination: flowfile attribute) -> ReplaceText (to use the attributes)
2 - How do you make sure the file is only processed once? The GetFile/GetHDFS processors have a "Keep Source File" property, which controls whether the source file is kept or removed when it is picked up. You can also have the file moved to a staging area and then moved back after it has been processed.
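The staging-area idea can be sketched as a "claim by rename": whichever worker manages to move the file first owns it, and everyone else sees it as already gone. The example below is only an illustration on the local filesystem using `java.nio.file`; the class `StagingClaim` and the `claim` method are hypothetical names, and on HDFS you would have to use the Hadoop `FileSystem` rename operation instead, which is not shown here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

// Hypothetical helper: claims a file by moving it into a staging directory.
class StagingClaim {
    /**
     * Tries to move {@code source} into {@code stagingDir}.
     * Returns the new path if this caller won the claim, or empty if the
     * file was already gone (for example, claimed by another worker).
     */
    static Optional<Path> claim(Path source, Path stagingDir) throws IOException {
        Files.createDirectories(stagingDir);
        Path target = stagingDir.resolve(source.getFileName());
        try {
            // ATOMIC_MOVE asks for a single rename, so a file cannot be half-claimed.
            return Optional.of(Files.move(source, target, StandardCopyOption.ATOMIC_MOVE));
        } catch (NoSuchFileException alreadyTaken) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("incoming");
        Path file = Files.write(dir.resolve("data.xml"), "<a/>".getBytes());
        Path staging = dir.resolve("staging");
        System.out.println("first claim succeeded:  " + claim(file, staging).isPresent());
        System.out.println("second claim succeeded: " + claim(file, staging).isPresent());
    }
}
```

This sketch assumes the source and staging directories are on the same filesystem, and it does not handle a name collision inside the staging directory.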
Upvotes: 1