Reputation: 133
I'm trying to write data from Kafka to HDFS. It is not documented anywhere how to use Confluent's kafka-connect-hdfs Java API.
Upvotes: 3
Views: 1564
Reputation: 5158
You don't need to use the Java API for this. Kafka Connect can be run from the command line or managed through its REST API... even if you are triggering connectors from Java, you can still just call the REST API (see the sketch after the links below).
Some documentation to get you started:
First, the Kafka Connect quickstart, just to make sure your system is in good shape before trying something advanced: http://docs.confluent.io/3.0.0/connect/intro.html#quickstart
If you are new to Kafka, maybe even start earlier with the Kafka quickstart: https://kafka.apache.org/quickstart
Once stand-alone mode works, try switching to distributed mode and check out the REST API.
Then try the HDFS connector. Either start with the quick-start: http://docs.confluent.io/3.0.0/connect/connect-hdfs/docs/hdfs_connector.html#quickstart
Or the blog tutorial: http://www.confluent.io/blog/how-to-build-a-scalable-etl-pipeline-with-kafka-connect
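If you do want to drive this from Java, one option is simply to submit the connector configuration to a Connect worker's REST endpoint. Here is a minimal sketch, assuming a worker running in distributed mode on localhost:8083 with the HDFS connector plugin installed; the connector name, topic, hdfs.url and flush.size values are placeholders in the style of the quickstart, not something you must use. It uses the JDK's built-in HTTP client (Java 11+), but any HTTP client would do.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateHdfsSinkConnector {
    public static void main(String[] args) throws Exception {
        // Connector config in the shape the Connect REST API expects: a name plus a config map.
        // Topic name, HDFS URL and flush.size are placeholders -- adjust for your setup.
        String payload = "{"
            + "\"name\": \"hdfs-sink\","
            + "\"config\": {"
            + "  \"connector.class\": \"io.confluent.connect.hdfs.HdfsSinkConnector\","
            + "  \"tasks.max\": \"1\","
            + "  \"topics\": \"test_hdfs\","
            + "  \"hdfs.url\": \"hdfs://localhost:9000\","
            + "  \"flush.size\": \"3\""
            + "}}";

        // POST to the Connect worker's REST API (distributed mode listens on port 8083 by default).
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8083/connectors"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(payload))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Afterwards, a GET to http://localhost:8083/connectors/hdfs-sink/status shows whether the connector and its task came up.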
Hope this helps on your way.
Upvotes: 6
Reputation: 599
You can use Kafka's Producer Java API to write to Kafka topics.
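Since the connector currently expects Avro data registered with the Schema Registry (see the note at the end of this answer), the producer would typically use Confluent's Avro serializer. A minimal sketch, assuming a broker on localhost:9092, a Schema Registry on localhost:8081, and the kafka-clients, avro, and kafka-avro-serializer jars on the classpath; the topic name test_hdfs and the record schema are only illustrations.

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Confluent's Avro serializer registers the schema with the Schema Registry
        // and writes records in the format the HDFS connector expects.
        props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        // A toy schema with a single string field -- replace with your own record layout.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"LogLine\","
            + "\"fields\":[{\"name\":\"message\",\"type\":\"string\"}]}");

        try (KafkaProducer<Object, Object> producer = new KafkaProducer<>(props)) {
            GenericRecord value = new GenericData.Record(schema);
            value.put("message", "hello hdfs");
            // Send to the topic the HDFS sink connector is configured to read from.
            producer.send(new ProducerRecord<>("test_hdfs", value));
        }
    }
}
```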
kafka-connect-hdfs will take messages from topics and put them in HDFS. This does not require Java code.
You run it as shown in the kafka-connect-hdfs quickstart:
$ ./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties \
etc/kafka-connect-hdfs/quickstart-hdfs.properties
At the moment, kafka-connect-hdfs only supports topics containing Avro data whose schemas are registered in the Schema Registry.
Upvotes: 0