Soheil Pourbafrani

Reputation: 3427

Writing Kafka Streams results to HDFS

I wrote a Kafka Streams application that writes its results to a local file using this code:

source.mapValues(record -> finall(record)).mapValues(record -> Arrays.deepToString(record))
            .writeAsText(PATH);

When I try to write the data to HDFS instead, using:

source.mapValues(record -> finall(record)).mapValues(record -> Arrays.deepToString(record))
            .writeAsText("hdfs://localhost:54310/output");

it fails with this error:

Unable to write stream to file at [hdfs://localhost:54310/output] hdfs:/localhost:54310/output (No such file or directory)

Is there any way to write Kafka Streams results to HDFS?

Upvotes: 1

Views: 3298

Answers (1)

Robin Moffatt

Reputation: 32090

I would avoid this pattern, and instead write from KStreams back to a Kafka topic, and simply stream that topic to HDFS using the Kafka Connect HDFS connector. This way you decouple your stream processing from writing the data elsewhere.
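
A minimal sketch of what the Connect side could look like, assuming the Confluent HDFS sink connector is installed on the Connect worker and that the Streams application ends with `.to("streams-output")` in place of `writeAsText(...)`. The topic name `streams-output`, the connector name, the file name and the `flush.size` value are placeholders chosen for illustration; the HDFS URL is the one from the question. Converter and output format settings are left at the worker defaults here and would probably need adjusting for plain string values.

```properties
# hdfs-sink.properties (hypothetical file name)
name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
# Topic the Streams application writes to (assumed name)
topics=streams-output
# NameNode address taken from the question
hdfs.url=hdfs://localhost:54310
# Number of records written before a file is committed to HDFS (illustrative value)
flush.size=1000
```

In standalone mode this would typically be passed to `connect-standalone` as the second argument, after the worker properties file.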

Upvotes: 3
