nobody

Reputation: 11080

View gzipped file content in hadoop

How can I decompress a compressed file in HDFS and view a few lines of it? The command below displays only the last part of the still-compressed data:

hadoop fs -tail /myfolder/part-r-00024.gz

Is there a way to use the -text command and pipe its output to the tail command? I tried the following, but it doesn't work:

hadoop fs -text /myfolder/part-r-00024.gz > hadoop fs -tail /myfolder/

Upvotes: 7

Views: 18257

Answers (4)

anne

Reputation: 127

Use gunzip to view the compressed file contents:

hdfs dfs -cat /path/filename.gz | gunzip

Upvotes: 0

leo9r

Reputation: 2047

The following will show you the specified number of lines without decompressing the whole file:

hadoop fs -cat /hdfs_location/part-00000.gz | zcat | head -n 20

The following will page through the file, again without decompressing all of it first:

hadoop fs -cat /hdfs_location/part-00000.gz | zmore
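If you do this often, the pipeline can be wrapped in a small shell function. This is only a sketch: `gz_head` is a hypothetical name, and it uses `gzip -dc` as the stdin decompressor (on most systems `zcat` behaves the same way when reading from a pipe).

```shell
# Hypothetical helper: print the first N lines (default 10) of a gzip
# stream read from stdin. head exits after N lines, so the rest of the
# stream is never decompressed.
gz_head() {
    gzip -dc | head -n "${1:-10}"
}

# Intended use, with the path from the answer above:
#   hadoop fs -cat /hdfs_location/part-00000.gz | gz_head 20
```

The function only touches stdin and stdout, so it can be tried on any local gzip stream before pointing it at HDFS.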

Upvotes: 19

nobody

Reputation: 11080

I ended up writing a Pig script:

A = LOAD '/myfolder/part-r-00024.gz' USING PigStorage('\t');
B = LIMIT A 10;
DUMP B;

Upvotes: 1

mattinbits

Reputation: 10428

Try the following. It should work as long as your file isn't too big, since the whole thing will be decompressed:

hadoop fs -text /myfolder/part-r-00024.gz | tail
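The attempt in the question used `>`, which sends the output to a local file named `hadoop` rather than to another command; `|` is what hands the decompressed text to the local `tail`. Here is a local sketch of the same pipe shape, assuming `gzip -dc` as a stand-in for `hadoop fs -text` so it runs without a cluster (the sample file is made up for illustration):

```shell
# Throwaway sample data: 100 numbered lines, gzip-compressed
# (hypothetical file, only for illustration)
sample=$(mktemp)
seq 1 100 | gzip -c > "$sample"

# gzip -dc stands in for `hadoop fs -text`; the pipe passes its stdout
# to the local tail, which keeps only the last 3 lines
last_lines=$(gzip -dc < "$sample" | tail -n 3)
printf '%s\n' "$last_lines"

rm -f "$sample"
```

Unlike `head`, `tail` has to read to the end of the stream, which is why the whole file gets decompressed.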

Upvotes: 4
