Reputation: 8957
I am trying to get the byte count of a specific file in an HDFS directory.
I tried fs.getFileStatus(), but I do not see any method for getting the byte count of the file; I can only see the getBlockSize() method.
Is there any way I can get the byte count of a specific file in HDFS?
Upvotes: 1
Views: 1353
Reputation: 2921
fs.getFileStatus() returns a FileStatus object, which has a getLen() method. It returns the "length of this file, in bytes." Have a closer look at the Javadoc: https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/fs/FileStatus.html
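For illustration, here is a minimal sketch of reading the length through getFileStatus() and getLen(). The class name HdfsFileSize, the helper fileLength, and the path /user/example/data.txt are hypothetical; the sketch assumes the Hadoop client libraries are on the classpath and that the Configuration picks up your cluster settings (for example from core-site.xml).

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsFileSize {

    /** Returns the length, in bytes, of the file at the given path. */
    static long fileLength(Configuration conf, String file) throws IOException {
        Path path = new Path(file);
        // Resolve the FileSystem from the path's scheme (hdfs://, file://, ...);
        // a path without a scheme falls back to the configured default file system.
        FileSystem fs = path.getFileSystem(conf);
        FileStatus status = fs.getFileStatus(path);
        // getLen() is the file length in bytes; getBlockSize() is only the block size.
        return status.getLen();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical path, used only when no argument is given.
        String file = args.length > 0 ? args[0] : "/user/example/data.txt";
        Configuration conf = new Configuration();
        System.out.println(file + ": " + fileLength(conf, file) + " bytes");
    }
}
```

If the path points to a file that does not exist, getFileStatus() throws a FileNotFoundException, so you may want to catch that separately.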
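But be aware that the file size is not the whole story on HDFS. Files are split into so-called data blocks; the default block size is 64 MB in Hadoop 1.x and 128 MB in Hadoop 2.x. A file smaller than one block does not fill a whole block on disk, but every file and every block costs NameNode memory. So if you deal with many small files (which is one big anti-pattern on HDFS), the cluster may handle less data than you expect. See this link for more details: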
https://hadoop.apache.org/docs/r2.6.1/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Data_Blocks
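To make the block point concrete, here is a rough sketch of how many blocks a file of a given length is split into. The class BlockMath and the method blockCount are hypothetical names, and the 128 MB block size is an assumption (the Hadoop 2.x default for dfs.blocksize); in practice you would take the real values from FileStatus.getLen() and FileStatus.getBlockSize().

```java
public class BlockMath {

    /** Number of blocks needed for a file of fileLen bytes (ceiling division). */
    static long blockCount(long fileLen, long blockSize) {
        if (fileLen < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLen must be >= 0 and blockSize > 0");
        }
        // Full blocks, plus one partial block if there is a remainder.
        return fileLen / blockSize + (fileLen % blockSize == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // assumed block size: 128 MB

        long tiny = 1024L;                   // a 1 KB file
        long large = 300L * 1024 * 1024;     // a 300 MB file

        System.out.println("1 KB file:   " + blockCount(tiny, blockSize) + " block(s)");
        System.out.println("300 MB file: " + blockCount(large, blockSize) + " block(s)");
    }
}
```

Under these assumptions the 1 KB file still needs its own block entry, and the 300 MB file needs three, so a million 1 KB files mean a million files and a million blocks for the NameNode to track, even though they hold only about 1 GB of data.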
Upvotes: 1