Shankar

Reputation: 8957

HDFS File System - How to get the byte count for a specific file in a directory

I am trying to get the byte count for a specific file in an HDFS directory.

I tried fs.getFileStatus(), but I do not see any method for getting the byte count of the file; I can only see the getBlockSize() method.

Is there any way I can get the byte count of a specific file in HDFS?

Upvotes: 1

Views: 1353

Answers (2)

TobiSH

Reputation: 2921

fs.getFileStatus() returns a FileStatus object, which has a getLen() method. It returns the "length of this file, in bytes." Maybe you should have a closer look at this: https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/fs/FileStatus.html.

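BUT be aware that the file size alone does not tell the whole story on HDFS. Files are organized in so-called data blocks; the default block size is 64 MB in older releases and 128 MB in Hadoop 2.x. A small file only uses as much disk space as its length, but every file and block is tracked in NameNode memory, so if you deal with many small files (which is one big anti-pattern on HDFS) the cluster may hold less than you expect. See this link for more details: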

https://hadoop.apache.org/docs/r2.6.1/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Data_Blocks
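To make the relationship between file length and blocks concrete, here is a minimal sketch in plain Java (no Hadoop dependency). It assumes you already have the two numbers that FileStatus exposes through getLen() and getBlockSize(); the class and method names (BlockMath, blockCount, lastBlockBytes) are made up for illustration and are not part of the Hadoop API.

```java
public class BlockMath {

    // Number of blocks a file of fileLen bytes is split into,
    // given a block size in bytes (ceiling division).
    static long blockCount(long fileLen, long blockSize) {
        if (fileLen < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLen must be >= 0 and blockSize > 0");
        }
        return fileLen / blockSize + (fileLen % blockSize == 0 ? 0 : 1);
    }

    // Bytes stored in the final block; every earlier block is full.
    static long lastBlockBytes(long fileLen, long blockSize) {
        if (fileLen < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLen must be >= 0 and blockSize > 0");
        }
        if (fileLen == 0) {
            return 0;
        }
        long remainder = fileLen % blockSize;
        return remainder == 0 ? blockSize : remainder;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;  // assumed 64 MB block size
        long fileLen = 150L * 1024 * 1024;   // hypothetical 150 MB file

        System.out.println("blocks: " + blockCount(fileLen, blockSize));
        System.out.println("bytes in last block: " + lastBlockBytes(fileLen, blockSize));
    }
}
```

With a real file you would pass in the values from fs.getFileStatus(path).getLen() and fs.getFileStatus(path).getBlockSize() instead of the hard-coded numbers.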

Upvotes: 1

Shankar

Reputation: 8957

We need to call getLen() on the FileStatus object returned by fs.getFileStatus(), i.e. fs.getFileStatus(path).getLen(), to get the file's byte count.

Upvotes: 0
