Reputation: 34086
I have a 1 GB file that I've put on HDFS. So, it would be broken into blocks and sent to different nodes in the cluster.
Is there any command to identify the exact size of the block of the file on a particular node?
Thanks.
Upvotes: 6
Views: 10042
Reputation: 1531
You should use the hdfs fsck command:
hdfs fsck /tmp/test.tar.gz -files -blocks
This command will print information about all the blocks that the file consists of:
/tmp/test.tar.gz 151937000 bytes, 2 block(s): OK
0. BP-739546456-192.168.20.1-1455713910789:blk_1073742021_1197 len=134217728 Live_repl=3
1. BP-739546456-192.168.20.1-1455713910789:blk_1073742022_1198 len=17719272 Live_repl=3
As you can see, the len field in each row shows the actual size of each block.
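For the 1 GB file from the question, assuming the default dfs.blocksize of 128 MB (134217728 bytes), fsck would report exactly 8 full blocks, since 8 × 134217728 = 1073741824 bytes = 1 GB.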
hdfs fsck also has many other useful options, which you can find on the official Hadoop documentation page.
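In particular, since you asked about blocks on a particular node, adding the -locations flag makes fsck print which datanodes hold each block; a minimal sketch, reusing the same path:
hdfs fsck /tmp/test.tar.gz -files -blocks -locations
Each block line is then followed by the addresses of the datanodes holding its replicas, so you can match blocks to nodes.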
Upvotes: 10
Reputation: 662
I do not have enough reputation to comment.
Have a look at the documentation page for setting various properties, which covers dfs.blocksize.
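To check which block size is currently configured on your cluster, hdfs getconf can read the property directly (a quick sketch; the printed value here is the usual 128 MB default, yours may differ):
hdfs getconf -confKey dfs.blocksize
134217728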
Apart from changing the configuration, you can view the actual size of a file with:
hadoop fs -ls fileNameWithPath
e.g.
hadoop fs -ls /user/edureka
output:
-rwxrwxrwx 1 edureka supergroup 391355 2014-09-30 12:29 /user/edureka/cust
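Note that -ls shows the file size (391355 bytes here), not the block size. To print the block size a particular file was written with, hadoop fs -stat supports the %o format specifier; a minimal sketch (the output values are illustrative):
hadoop fs -stat "block size: %o, file size: %b, replication: %r" /user/edureka/cust
block size: 134217728, file size: 391355, replication: 1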
Upvotes: 0