kannanrbk

Reputation: 7134

Hadoop: How to compute the actual file size?

I am using Hadoop to store files, and I want to know the actual (uncompressed) file size.

getFileSystem().getContentSummary(new Path(fileName)).getLength();

It returns the compressed file size. I am using the default Hadoop compression codec.

How can I compute the actual file size?

Upvotes: 0

Views: 714

Answers (1)

Chris White

Reputation: 30089

Unless the compression codec stores the uncompressed size in a header or footer of the compressed file, there is no way to work out the uncompressed size other than decompressing the stream and counting the bytes, for example by piping it through something like dd or by counting the bytes in Java.
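To illustrate the "decompress and count" route, here is a minimal sketch in plain Java. It assumes gzip data and uses only `java.util.zip`, so the class name `UncompressedSizeByCounting` and the `gzip` helper are hypothetical and exist only for the demo. In Hadoop the decompressing stream would presumably come from the codec instead, e.g. `new CompressionCodecFactory(conf).getCodec(path)` followed by `codec.createInputStream(fs.open(path))`; that wiring is an assumption about your setup and is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class UncompressedSizeByCounting {

    /** Reads the stream to the end and returns how many bytes it produced. */
    public static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    /** Decompresses gzip data on the fly and counts the uncompressed bytes. */
    public static long uncompressedSize(InputStream compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(compressed)) {
            return countBytes(in);
        }
    }

    /** Demo helper (hypothetical): gzip a byte array in memory. */
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip(new byte[100_000]);
        System.out.println("compressed bytes:   " + compressed.length);
        System.out.println("uncompressed bytes: "
                + uncompressedSize(new ByteArrayInputStream(compressed)));
    }
}
```

Nothing is kept in memory beyond the 8 KB buffer, but the whole file still has to be read and decompressed, so the cost grows with the file size.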

GZip is one example: the last 4 bytes of the file hold the uncompressed size in bytes (assuming the size is not larger than 4 bytes can represent, i.e. under 4 GiB).
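A rough sketch of reading that gzip trailer in Java follows. The field is stored little-endian and holds the size modulo 2^32, so the value is only trustworthy for a single-member gzip file under 4 GiB. The class name `GzipTrailerSize` and the in-memory `gzip` helper are hypothetical. For a real file you would only need to fetch the final 4 bytes (by seeking to `length - 4`) rather than load everything as this demo does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class GzipTrailerSize {

    /**
     * Returns the gzip ISIZE field: the last 4 bytes of the file,
     * interpreted as an unsigned little-endian 32-bit integer.
     */
    public static long readIsize(byte[] gzipBytes) {
        if (gzipBytes.length < 18) {
            // 10-byte header + 8-byte trailer is the bare minimum
            throw new IllegalArgumentException("too short to be a gzip file");
        }
        int end = gzipBytes.length;
        return (gzipBytes[end - 4] & 0xFFL)
                | (gzipBytes[end - 3] & 0xFFL) << 8
                | (gzipBytes[end - 2] & 0xFFL) << 16
                | (gzipBytes[end - 1] & 0xFFL) << 24;
    }

    /** Demo helper (hypothetical): gzip a byte array in memory. */
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip(new byte[12_345]);
        System.out.println("size from trailer: " + readIsize(compressed));
    }
}
```

This avoids decompressing anything, but it only works for gzip; other codecs may not record the uncompressed size at all.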

Upvotes: 1
