Reputation: 2714
I have been reading about NiFi and have a few questions. Consider a use case where I want to move data from the local filesystem into HDFS. I will use the GetFile and PutHDFS processors.
So when I pass a location to GetFile, it will pick up the data, write it into the content repository, and pass the resulting flow file on to the PutHDFS processor for ingestion.
Questions:
1) I have seen that flow file content is a byte representation. Does NiFi perform a byte conversion (if my source file is a text file)?
2) How is the data moved from the content repository to HDFS?
Upvotes: 0
Views: 637
Reputation: 18630
1) There is not really a conversion being done... the GetFile processor is reading bytes from the source file and writing bytes to the destination in the content repository. Whatever the content of the source file was, it will be the same in the content repository. This operation is performed in a streaming fashion so that a large file can be moved into the content repository without reading the whole file into memory.
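To make the "streaming fashion" point concrete, here is a minimal sketch of a buffered byte-for-byte copy; this is not NiFi's actual code, and the source path and content-claim path are hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StreamingCopy {
    public static void main(String[] args) throws IOException {
        // Hypothetical paths, for illustration only
        Path source = Paths.get("/data/in/source.txt");
        Path contentClaim = Paths.get("/nifi/content_repository/claim-001");

        byte[] buffer = new byte[8192]; // fixed-size buffer: the whole file is never held in memory
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(contentClaim)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n); // bytes pass through unchanged, no conversion
            }
        }
    }
}
```

Whether the source is text or binary makes no difference here: the same bytes that came in go out.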
2) The PutHDFS processor uses the Apache Hadoop 2.6.2 client to stream the bytes from the content repository into HDFS. It is similar to performing an `hdfs dfs -put` from the command line.
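For illustration, a rough sketch of what that streaming looks like with the Hadoop FileSystem client; this is not PutHDFS's actual implementation, and the NameNode address, local claim path, and HDFS destination path are all assumptions:

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsPutSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020"); // hypothetical NameNode address

        try (FileSystem fs = FileSystem.get(conf);
             // hypothetical content claim written earlier by GetFile
             InputStream in = Files.newInputStream(Paths.get("/nifi/content_repository/claim-001"));
             FSDataOutputStream out = fs.create(new Path("/user/nifi/source.txt"))) {
            // stream with a small buffer, much like 'hdfs dfs -put'
            IOUtils.copyBytes(in, out, 4096);
        }
    }
}
```

Again the copy is buffered, so even very large flow files never have to fit in the JVM heap.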
Upvotes: 1