Reputation: 849
I have some data in HDFS under /user/Cloudera/Test/*. I can see the records just fine by running hdfs dfs -cat /user/Cloudera/Test/*.
Now I need to read that same data as an RDD in Scala. I have tried the following in the Scala shell:
val file = sc.textFile("hdfs://quickstart.cloudera:8020/user/Cloudera/Test")
Then I wrote a filter and a for loop to read the words. But when I finally call println, it says the file is not found.
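For context, the follow-up code is along these lines (this is an illustrative sketch, not the exact snippet):

```scala
// Illustrative sketch of the filter/println attempt.
// `sc` is the SparkContext provided by the Scala (spark) shell.
val file = sc.textFile("hdfs://quickstart.cloudera:8020/user/Cloudera/Test")
val words = file.flatMap(line => line.split("\\s+")) // split each line into words
val filtered = words.filter(word => word.nonEmpty)   // drop empty tokens
filtered.collect().foreach(println)                  // action: this is where the error surfaces
```

Note that Spark evaluates RDDs lazily, so a bad path is only reported when an action like collect() or count() runs, which is why the error appears at the println step rather than at sc.textFile.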
Can anyone please help me work out what the HDFS URL should be in this case? Note: I am using the Cloudera CDH5.0 VM.
Upvotes: 5
Views: 19823
Reputation: 1006
If you are trying to access your file in a Spark job, then you can simply use the path:
val file = sc.textFile("/user/Cloudera/Test")
Spark will resolve this path automatically. You do not need to add a scheme and host as a prefix, because a Spark job reads from the default HDFS filesystem configured for the cluster.
Hope this solves your query.
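A minimal end-to-end sketch of this approach (the non-empty-line filter is illustrative):

```scala
// Assumes an active SparkContext `sc` (created automatically in spark-shell)
// on a cluster whose default filesystem is HDFS, so the bare path resolves
// against fs.defaultFS without any hdfs://host:port prefix.
val file = sc.textFile("/user/Cloudera/Test")
val nonEmpty = file.filter(line => line.trim.nonEmpty) // keep non-blank lines
println(s"non-empty lines: ${nonEmpty.count()}")       // count() is the action that triggers the read
```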
Upvotes: 3
Reputation: 16
Instead of using "quickstart.cloudera" and the port, try using just the IP address:
val file = sc.textFile("hdfs://<ip>/user/Cloudera/Test")
Upvotes: 0