ngi

Reputation: 146

Spark Streaming ingestion on HDFS

I am trying to write data to HDFS with Structured Streaming, using this code:

val query = output
  .writeStream
  .format("csv")
  .option("path", "hdfs://hdfs_path")
  .option("checkpointLocation", "checkpoint")
  .start()

But it fails with the following error:

Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: user

Does anyone know how to resolve this issue?

Upvotes: 1

Views: 112

Answers (1)

Ramesh Maharjan

Reputation: 41957

The error suggests that you are not putting a hostname and port after hdfs:// and are instead giving the path as hdfs://user/...,

which tells Spark that the hostname is user, and that is not correct.

So find the hostname and port of the NameNode and use them in the path.

Instead of

.option("path", "hdfs://hdfs_path")

you should be using

.option("path", "hdfs://hostname:port/hdfs_path")
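To see why the first path segment gets treated as a host, here is a minimal sketch that parses both forms with java.net.URI, which splits a URI into host, port, and path in the same way the error message implies. The object name HdfsPathCheck, the paths, the host namenode.example.com, and the port 8020 are hypothetical placeholders, not values from the question; substitute your own NameNode address.

```scala
import java.net.URI

object HdfsPathCheck {
  // Returns (host, port, path) as java.net.URI parses them.
  // The port is -1 when the URI does not specify one.
  def parts(location: String): (String, Int, String) = {
    val uri = new URI(location)
    (uri.getHost, uri.getPort, uri.getPath)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical path without a NameNode address: "user" is parsed as the host.
    println(parts("hdfs://user/some/dir"))
    // prints (user,-1,/some/dir)

    // Hypothetical path with an assumed NameNode host and port.
    println(parts("hdfs://namenode.example.com:8020/user/some/dir"))
    // prints (namenode.example.com,8020,/user/some/dir)
  }
}
```

In the first case the host comes out as user, which matches the UnknownHostException: user in the question. Running a check like this on your own path before passing it to .option("path", ...) is a quick way to confirm that the host and port are what you expect.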

Upvotes: 1
