Reputation: 262
Other than the Confluent HDFS library (not open-source), is there any completely open-source library to move messages from Kafka (using Kafka Connect) to HDFS 3?
Upvotes: 1
Views: 593
Reputation: 137
One solution is to write a Kafka consumer in Python or any other language you prefer. The consumer reads the messages coming from the Kafka topic and writes each one to its own file in HDFS, using the Hadoop command-line tools on Linux (this can also be done with a Python Hadoop package).
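As a rough sketch of that idea, assuming the third-party `kafka-python` package and an `hdfs` CLI on the PATH: the topic name, broker address, consumer group, and target directory below are placeholders, and the helper names are made up for illustration. It relies on `hdfs dfs -put - <path>`, which reads the file content from stdin, and it assumes the target directory already exists in HDFS.

```python
import subprocess
from typing import Callable, Iterable, Tuple

# (topic, partition, offset, value) for one Kafka message
Record = Tuple[str, int, int, bytes]


def hdfs_path_for(base_dir: str, topic: str, partition: int, offset: int) -> str:
    """Build a unique HDFS file path for one message (hypothetical naming scheme)."""
    return f"{base_dir.rstrip('/')}/{topic}-{partition}-{offset:020d}.msg"


def put_via_hdfs_cli(path: str, data: bytes) -> None:
    """Write bytes to an HDFS file by piping them into `hdfs dfs -put - <path>`."""
    subprocess.run(["hdfs", "dfs", "-put", "-", path], input=data, check=True)


def copy_records(
    records: Iterable[Record],
    base_dir: str,
    write: Callable[[str, bytes], None] = put_via_hdfs_cli,
) -> int:
    """Write each record to its own HDFS file; return how many were written."""
    count = 0
    for topic, partition, offset, value in records:
        write(hdfs_path_for(base_dir, topic, partition, offset), value)
        count += 1
    return count


def main() -> None:
    # Imported here so the helpers above work without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "my-topic",                          # placeholder topic
        bootstrap_servers="localhost:9092",  # placeholder broker
        group_id="hdfs-writer",              # placeholder consumer group
        auto_offset_reset="earliest",
    )
    # Iterating the consumer blocks and yields records as they arrive.
    copy_records(
        ((r.topic, r.partition, r.offset, r.value or b"") for r in consumer),
        "/data/kafka",                       # placeholder HDFS directory
    )
```

Call `main()` to run it against a real cluster. Starting one `hdfs` process per message is slow and produces many small files, so in practice you would probably buffer several messages per file.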
Upvotes: 0
Reputation: 191874
The HDFS2 connector is open source and free to use under the Confluent Community License, as long as you're not offering it as a hosted service. Alternatively, as I said before, Apache NiFi is a richer workflow product that works well in the Hadoop ecosystem alongside Kafka. Spark or Flink are often used for this too.
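For the connector route, here is a minimal sketch of registering an HDFS sink through the Kafka Connect REST API (`POST /connectors`). The connector name, topic list, NameNode URL, and Connect worker URL are placeholders, the function names are made up for illustration, and it assumes the HDFS2 connector plugin is already installed on the Connect worker. Whether it works against your HDFS 3 cluster is something you would need to verify yourself.

```python
import json
import urllib.request
from typing import Dict, List


def build_hdfs_sink_config(
    name: str, topics: List[str], hdfs_url: str, flush_size: int = 1000
) -> Dict:
    """Build the JSON payload for creating an HDFS sink connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
            "tasks.max": "1",
            "topics": ",".join(topics),
            "hdfs.url": hdfs_url,
            # Number of records written to a file before it is committed.
            "flush.size": str(flush_size),
        },
    }


def register_connector(connect_url: str, payload: Dict) -> Dict:
    """POST the payload to a Kafka Connect worker and return its JSON reply."""
    req = urllib.request.Request(
        f"{connect_url.rstrip('/')}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would look something like `register_connector("http://localhost:8083", build_hdfs_sink_config("hdfs-sink", ["my-topic"], "hdfs://namenode:8020"))`, with the URLs swapped for your own.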
Upvotes: 1