Ankur Kumar

Reputation: 56

Need to use 1 Processor instead of 5 FetchHDFS in NiFi

I have 5 XML files in HDFS that I am fetching with Apache NiFi. The flow starts with a GenerateFlowFile processor, followed by 5 separate FetchHDFS processors. I can't use GetHDFS because it deletes the files from the directory, and I don't have permission to ingest them back. Is there a way to avoid using 5 FetchHDFS processors? All the files are in the same directory, and I want to keep them there so that I can test multiple times. I then pass the files to a TransformXML processor to convert them to JSON.

Upvotes: 1

Views: 329

Answers (2)

Ankur Kumar

Reputation: 56

Thanks, everyone, for answering. I am unable to upvote anyone's answer, so I am writing up what I did.

First I used the ListHDFS processor, which lists all the filenames in the directory. Then I used FetchHDFS and set its HDFS Filename property to '${path}/${filename}'.

Change ${path} to the path of your directory and leave ${filename} as is: it is an attribute written by ListHDFS, and that is where FetchHDFS picks up the filenames. This way there is no need for loops or anything else, and as soon as a new file is uploaded to the directory, ListHDFS picks it up. So you can leave the whole flow running.
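As a rough sketch of that setup, the processor properties could look something like the following. The directory /data/xml-input and the File Filter value are hypothetical placeholders, not values from my actual flow, and both HDFS processors also need Hadoop Configuration Resources pointing at your cluster's config files.

```
ListHDFS
  Directory              = /data/xml-input      (hypothetical path; use your own directory)
  Recurse Subdirectories = false
  File Filter            = .*\.xml              (assumption: only the XML files are wanted)

FetchHDFS      (fed by the "success" relationship of ListHDFS)
  HDFS Filename          = /data/xml-input/${filename}

TransformXML   (fed by the "success" relationship of FetchHDFS)
```

One thing to keep in mind when testing repeatedly: ListHDFS remembers what it has already listed, so to fetch the same files again you would probably need to clear the processor's state (right-click the processor, View state, Clear state) before restarting it.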

Upvotes: 0

Mike R

Reputation: 588

Instead of the GetHDFS processor, try the ListHDFS processor, as it lists the entire directory and doesn't delete the files. The ListHDFS description says, "Unlike GetHDFS, this Processor does not delete any data from HDFS."

Upvotes: 0
