Reputation: 1759
We have a scenario where a Kafka producer should read a list of incoming files and publish their contents to Kafka topics. I've read about the FileStream source connector (http://docs.confluent.io/3.1.0/connect/connect-filestream/filestream_connector.html), but it reads only a single file and sends new lines appended to that file; file rotation is not handled. A few questions: 1) Is it better to implement our own producer code to meet this requirement, or can we extend the FileStream connector class so that it picks up new files and sends them to Kafka topics? 2) Is there any other source connector that can be used in this scenario?
In terms of performance and ease of development, which approach is better: developing our own producer code to read files and send them to Kafka, or extending the connector code and modifying it?
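To make the standalone-producer option concrete, here is a minimal sketch of the file-reading side, using only the standard library. The Kafka send is abstracted behind a callback (in a real implementation this would wrap `KafkaProducer#send`); the directory path, class name, and the skip-already-seen-files strategy are all illustrative assumptions, not part of the original question:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;
import java.util.function.BiConsumer;

// Sketch: poll a directory, read each new file line by line, and hand
// every line to a sink callback (in production this would wrap
// KafkaProducer#send with the file name mapped to a topic or key).
// Files already seen are skipped, which is one simple way to cope with
// rotation: each rotated-out file is just a new file to pick up.
public class DirectoryToKafka {
    private final Set<String> processed = new HashSet<>();

    // sink receives (fileName, line); swap in the real producer here.
    // Returns the number of new files processed in this poll.
    public int pollOnce(Path dir, BiConsumer<String, String> sink) throws IOException {
        int newFiles = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                if (!Files.isRegularFile(file) || !processed.add(name)) {
                    continue; // not a file, or already processed
                }
                for (String line : Files.readAllLines(file)) {
                    sink.accept(name, line);
                }
                newFiles++;
            }
        }
        return newFiles;
    }
}
```

Running `pollOnce` in a loop (or driving it from a `java.nio.file.WatchService`) gives you the "read new files as they arrive" behaviour the connector lacks out of the box.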
Any kind of feedback will be greatly appreciated! Thank you!
Upvotes: 0
Views: 1283
Reputation: 1129
I personally used the Producer API directly. I handled file rotation myself and could publish in real time. The tricky part was making sure the files were exactly the same on the source and sink systems (exactly-once processing).
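One common way to get the exactly-once behaviour described above is to checkpoint a per-file offset and advance it only after the broker has acknowledged the sent lines. The sketch below keeps offsets in an in-memory map purely for illustration; it is a hypothetical design, not the answerer's actual code, and a real implementation would persist the offsets durably (e.g. in a local file or a compacted Kafka topic) so a restart resumes instead of re-sending:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: track a committed line offset per file, so a file that grows
// (or a restart after a crash) resumes from the last committed position
// rather than re-sending everything. Persisting `committed` durably is
// what turns this into an exactly-once pipeline on restart.
public class OffsetTrackingReader {
    private final Map<String, Integer> committed = new HashMap<>();

    // Returns only the lines not yet committed for this file.
    public List<String> readNew(Path file) throws IOException {
        String key = file.getFileName().toString();
        List<String> all = Files.readAllLines(file);
        int from = Math.min(committed.getOrDefault(key, 0), all.size());
        return all.subList(from, all.size());
    }

    // Call only after the broker has acked these lines
    // (e.g. in the producer send callback).
    public void commit(Path file, int linesSent) {
        committed.merge(file.getFileName().toString(), linesSent, Integer::sum);
    }
}
```

Coupling `commit` to the producer's acknowledgement callback is the key design choice: lines are never marked done before the broker confirms them, so the worst case after a crash is a bounded re-read, not data loss.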
Upvotes: 1
Reputation: 4313
You could write a producer as you suggested, or, better yet, write your own connector using the Kafka Connect developer API.
Upvotes: 0
Reputation: 451
Have you taken a look at Akka Streams - Reactive Kafka? https://github.com/akka/reactive-kafka
Check this example: https://github.com/ktoso/akka-streams-alpakka-talk-demos-2016/blob/master/src/main/java/javaone/step1_file_to_kafka/Step1KafkaLogStreamer.java
Upvotes: 0