Lawan subba

Reputation: 690

Read File directly from HDFS

Is there a way to read a file of any format directly from HDFS by using its HDFS path, instead of having to pull the file to the local filesystem first and read it there?

Upvotes: 3

Views: 26577

Answers (4)

OneCricketeer

Reputation: 191894

You have to pull the entire file. Whether you use the cat or text command, the entire file is still streamed to your shell; there is just no remnant of the file once the command ends. So, if you plan on inspecting the file several times, it's better to get it (hdfs dfs -get) and work on the local copy.
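A minimal sketch of the two approaches, assuming the hdfs client is on the PATH and configured for the cluster. The function names hdfs_head and hdfs_fetch are hypothetical helpers, not part of Hadoop:

```shell
# Hypothetical helper: print the first N lines (default 10) of an HDFS file.
# The data is streamed through the pipe and nothing is left on local disk.
hdfs_head() {
    hdfs dfs -cat "$1" | head -n "${2:-10}"
}

# Hypothetical helper: copy the file to a local destination once,
# so it can be inspected repeatedly without streaming it again.
hdfs_fetch() {
    hdfs dfs -get "$1" "$2"
}
```

For example, hdfs_head /path/to/file.csv 5 would show the first five lines, while hdfs_fetch /path/to/file.csv . would keep a copy in the current directory.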

As an HDFS client, you must contact the NameNode to acquire all block locations for a particular file; the data itself is then read from the DataNodes that hold those blocks.
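To see the block locations the NameNode reports for a file, hdfs fsck can be used. This is only a sketch: show_block_locations is a hypothetical wrapper name, and running fsck may require suitable permissions on the path:

```shell
# Hypothetical helper: list the blocks of an HDFS file and the
# DataNodes that hold each replica, as reported by the NameNode.
show_block_locations() {
    hdfs fsck "$1" -files -blocks -locations
}
```

Calling show_block_locations /path/to/file.csv would print one entry per block together with the addresses of its replicas.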

Upvotes: 2

Mobility

Reputation: 3305

hdfs dfs -cat /path or hadoop fs -cat /path

Upvotes: 4

philantrovert

Reputation: 10092

You can use the cat command on HDFS to read regular text files.

hdfs dfs -cat /path/to/file.csv

To read compressed files such as gz or bz2, you can use:

hdfs dfs -text /path/to/file.gz

These are the two read methods that Hadoop supports natively through FsShell commands. For more complex file types, you will need something more involved, such as a Java program.
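The choice between the two commands can be wrapped in a small shell function. This is a sketch under assumptions: hdfs_read is a hypothetical name, and it picks the command from the file extension only, covering just the gz and bz2 cases mentioned above:

```shell
# Hypothetical helper: read an HDFS file to stdout, using -text for
# the compressed extensions named above and -cat for everything else.
hdfs_read() {
    case "$1" in
        *.gz|*.bz2) hdfs dfs -text "$1" ;;
        *)          hdfs dfs -cat "$1" ;;
    esac
}
```

For example, hdfs_read /path/to/file.gz | head would decompress on the fly, while hdfs_read /path/to/file.csv passes the bytes through unchanged.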

Upvotes: 6

jedijs

Reputation: 563

You can try hdfs dfs -cat:

Usage: hdfs dfs -cat [-ignoreCrc] URI [URI ...]

hdfs dfs -cat /your/path

Upvotes: 2
