Reputation: 157
When using the Cloudera VM, how can you access information in HDFS? I know there isn't a direct path to HDFS, but I also don't see how to dynamically access it.
After creating a Hive table through the Hive CLI, I attempted to load some data from a file located in HDFS:
load data inpath '/test/student.txt' into table student;
But then I just get this error:
FAILED: SemanticException Line 1:17 Invalid path ''/test/student.txt'': No files matching path hdfs://quickstart.cloudera:8020/test/student.txt
I also tried to load data that isn't in HDFS into a Hive table, like so:
load data inpath '/home/cloudera/Desktop/student.txt' into table student;
However, that just produced this error:
FAILED: SemanticException Line 1:17 Invalid path ''/home/cloudera/Desktop/student.txt'': No files matching path hdfs://quickstart.cloudera:8020/home/cloudera/Desktop/student.txt
Once again, I see it trying to access data with the root hdfs://quickstart.cloudera:8020, and I'm not sure what that is, but it doesn't seem to be the root directory for the HDFS.
I'm not sure what I'm doing wrong. I made sure the file is located in HDFS, so I don't know why this error is coming up or how to fix it.
Upvotes: 2
Views: 1856
Reputation: 192043
how can you access information in HDFS
Well, you certainly don't need to use Hive to do it. hdfs dfs commands are how you interact with HDFS.
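For example, to list the HDFS root and print a file's contents (the /test path is taken from your question):
hdfs dfs -ls /
hdfs dfs -cat /test/student.txt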
I'm not sure what that is, but it doesn't seem to be the root directory for the HDFS
It is the root of HDFS. quickstart.cloudera is the hostname of the VM, and port 8020 is the default HDFS NameNode port.
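That URI comes from the fs.defaultFS setting, which you can confirm with:
hdfs getconf -confKey fs.defaultFS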
Your exceptions come from the difference made by the LOCAL keyword.
What you're doing
LOAD DATA INPATH <hdfs location>
vs. what you seem to want
LOAD DATA LOCAL INPATH <local file location>
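Concretely, using the path from your second attempt, the following should work, since LOCAL makes Hive read from the local filesystem of the machine running the client (and copy the file into HDFS rather than move it):
load data local inpath '/home/cloudera/Desktop/student.txt' into table student;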
Or, if the file really is in HDFS, it's not clear how you put it there, but HDFS definitely doesn't have a /home folder or a Desktop, so the second error at least makes sense.
Anyways, hdfs dfs -put /home/cloudera/Desktop/student.txt /test/ is one way to upload your file, assuming the hdfs:///test folder already exists. If /test does not exist, then hdfs dfs -put /home/cloudera/Desktop/student.txt /test instead renames your file to /test on HDFS.
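So a minimal end-to-end sequence, assuming the student table already exists, might look like:
hdfs dfs -mkdir /test
hdfs dfs -put /home/cloudera/Desktop/student.txt /test/
hdfs dfs -ls /test
After that, your original load data inpath '/test/student.txt' into table student; should find the file. Keep in mind that a non-LOCAL LOAD DATA moves the file out of /test and into the table's directory.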
Note: You can create an EXTERNAL TABLE over an HDFS directory, you don't need to use the LOAD DATA command.
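A sketch of that approach (the table name, columns, and delimiter here are assumptions, since your schema isn't shown):
create external table student_ext (id int, name string)
row format delimited
fields terminated by '\t'
location '/test';
Any file you drop into /test is then queryable through student_ext, with no LOAD DATA step, and dropping the table leaves the data in place.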
Upvotes: 1