Reputation: 123
I have data in a file on my local Windows machine, which has Apache NiFi running on it. I want to send this file to HDFS over the network using NiFi. How should I configure the PutHDFS processor in NiFi on the local machine so that it can send data to HDFS over the network?
Thank you!
Upvotes: 3
Views: 14157
Reputation: 33
Just add the paths of the Hadoop core configuration files to the first field:
$HADOOP_HOME/conf/hadoop/hdfs-site.xml, $HADOOP_HOME/conf/hadoop/core-site.xml
Then set the Directory field to the HDFS directory where the ingested data should be stored, and leave everything else at its default.
Upvotes: 0
Reputation: 1633
Using the GetFile processor, or the combination of ListFile and FetchFile, you can bring this file from your local disk into NiFi and pass it on to the PutHDFS processor. The PutHDFS processor relies on the core-site.xml and hdfs-site.xml files referenced in its configuration.
Upvotes: 0
Reputation: 18630
You need to copy the core-site.xml and hdfs-site.xml from one of your Hadoop nodes to the machine where NiFi is running. Then configure PutHDFS so that the configuration resources are "/path/to/core-site.xml,/path/to/hdfs-site.xml". That is all that is required from the NiFi perspective; those files contain all of the information it needs to connect to the Hadoop cluster.
You'll also need to ensure that the machine where NiFi is running has network access to all of the machines in your Hadoop cluster. You can look through those config files and find any hostnames and IP addresses and make sure they can be accessed from the machine where NiFi is running.
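If you'd rather not read through the config files by hand, here is a rough Python sketch of that check. It is not part of NiFi or Hadoop; the helper names `extract_endpoints` and `is_reachable` are made up for illustration. It assumes the usual Hadoop `<configuration>/<property>/<value>` layout and only picks up values that contain an explicit `host:port`, so things like HA nameservice names without a port will be missed, and a successful TCP connect does not prove that HDFS itself is healthy.

```python
import re
import socket
import sys
import xml.etree.ElementTree as ET

# Matches "host:port", optionally preceded by a scheme such as "hdfs://".
ENDPOINT = re.compile(r"(?:[A-Za-z][A-Za-z0-9+.-]*://)?([A-Za-z0-9._-]+):(\d{1,5})")


def extract_endpoints(xml_source):
    """Return sorted (host, port) pairs found in the <value> elements."""
    root = ET.fromstring(xml_source)
    found = set()
    for prop in root.iter("property"):
        value = prop.findtext("value") or ""
        for host, port in ENDPOINT.findall(value):
            # 0.0.0.0 is a bind-all address, not something to connect to.
            if host != "0.0.0.0":
                found.add((host, int(port)))
    return sorted(found)


def is_reachable(host, port, timeout=3.0):
    """Try a plain TCP connection; True if it succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Pass the copied config files as arguments.
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            endpoints = extract_endpoints(handle.read())
        for host, port in endpoints:
            status = "ok" if is_reachable(host, port) else "UNREACHABLE"
            print(f"{path}: {host}:{port} {status}")
```

Run it on the NiFi machine, passing the paths of the copied core-site.xml and hdfs-site.xml as arguments. Anything reported as UNREACHABLE points to a firewall, DNS, or routing problem to fix before PutHDFS can work.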
Upvotes: 13