Reputation: 363
I am trying to copy data from S3 to HDFS using ListS3, FetchS3Object and PutHDFS. The data in the S3 bucket is structured as follows, and I need to copy it to HDFS with the same folder structure (the folder names are dynamic).
bucketname/parent-folder1/subfolder1/filename1.txt
bucketname/parent-folder1/subfolder2/filename2.txt
bucketname/parent-folder2/subfolder1/filename3.txt
The PutHDFS processor is showing the following error:
org.apache.nifi.processor.exception.ProcessException: Copied file to HDFS but could not rename dot file /dev/.parent-folder1/subfolder1/filename1.txt to its final filename
I understand that folders are virtual in S3. It works if I introduce an UpdateAttribute processor (${filename:replaceAll("/", "-")}), but then the folder structure is not created in HDFS. What are the other options? Is there a template for this?
Some doubts on error handling:
1) The ListS3 processor maintains state. What happens when ListS3 and FetchS3Object are successful but PutHDFS fails? Will ListS3 list the file again, or is it up to the developer to handle the exception? Is it possible to reuse the flow file loaded by FetchS3Object?
2) How does an end user know which copies succeeded and which failed?
Thanks Tilak
Upvotes: 0
Views: 1363
Reputation: 18630
I think the issue is that the "filename" attribute of the flow files coming out of FetchS3Object is set to something like "parent-folder1/subfolder1/filename1.txt", but PutHDFS needs this value to be just "filename1.txt".
You could check this by stopping PutHDFS, waiting until a flow file is in its incoming queue, then listing the queue and looking at the flow file's attributes to see what filename is set to.
If that is the case, then you could use an UpdateAttribute processor before PutHDFS to set filename = ${filename:substringAfterLast('/')}.
Then in PutHDFS, set the Directory property to "/dev/${path}", or whatever attribute holds the path that came from the bucket.
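For example, a minimal configuration along these lines should work, assuming the incoming filename attribute still holds the full S3 key; "path" here is a new attribute created in UpdateAttribute (not something FetchS3Object sets for you), and "/dev" is just the root taken from your error message:

UpdateAttribute (before PutHDFS):
    path     = ${filename:substringBeforeLast('/')}
    filename = ${filename:substringAfterLast('/')}

PutHDFS:
    Directory = /dev/${path}

PutHDFS should create the directory if it does not already exist, so the parent-folder/subfolder layout from the bucket gets recreated on HDFS.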
To answer your specific questions...
1) ListS3 has no knowledge of whether things downstream of it succeed or not, so it will not retry or reset its state.
2) You should know success or failure based on the relationships of PutHDFS. You should route the failure relationship somewhere so that the data can be reprocessed or retried.
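For instance, one common pattern (a sketch, not the only way to do it) is to loop the failure relationship back into PutHDFS so transient errors are retried, and also branch it to a LogAttribute or PutEmail processor so someone actually sees the failures; the success relationship can go to LogAttribute for auditing or be auto-terminated once you trust the flow:

PutHDFS -- success --> LogAttribute (audit) or auto-terminate
PutHDFS -- failure --> PutHDFS (retry loop) and/or LogAttribute / PutEmail (alerting)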
Upvotes: 2