Mayank Porwal

Reputation: 34086

How to know the exact block size of a file on a Hadoop node?

I have a 1 GB file that I've put on HDFS. So, it would be broken into blocks and sent to different nodes in the cluster.

Is there any command to identify the exact size of each block of the file on a particular node?

Thanks.

Upvotes: 6

Views: 10042

Answers (3)

Karthik

Reputation: 1171

You can try:

hdfs getconf -confKey dfs.blocksize
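
Note that this prints the configured default block size for the cluster, not necessarily the block sizes of an individual file. On a cluster using the stock 128 MB default, the output would typically look like:

134217728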

Upvotes: 4

maxteneff

Reputation: 1531

You should use the hdfs fsck command:

hdfs fsck /tmp/test.txt -files -blocks

This command will print information about all the blocks that the file consists of:

/tmp/test.tar.gz 151937000 bytes, 2 block(s):  OK
0. BP-739546456-192.168.20.1-1455713910789:blk_1073742021_1197 len=134217728 Live_repl=3
1. BP-739546456-192.168.20.1-1455713910789:blk_1073742022_1198 len=17719272 Live_repl=3

As you can see, the len field in every row shows the actual used capacity of each block.

There are also many other useful features of hdfs fsck, which you can see on the official Hadoop documentation page.
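
If you also want to see which DataNodes hold each block (the "particular node" part of the question), you can add the -locations option, shown here against the same example path:

hdfs fsck /tmp/test.txt -files -blocks -locations

Each block line is then followed by the addresses of the DataNodes that store its replicas.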

Upvotes: 10

Aditya W

Reputation: 662

I do not have enough reputation to comment.

Have a look at the documentation page for setting various properties, which covers

dfs.blocksize

Apart from the configuration change, you can view the actual size of a file with

hadoop fs -ls fileNameWithPath

e.g.

hadoop fs -ls /user/edureka 

output:

-rwxrwxrwx   1 edureka supergroup     391355 2014-09-30 12:29 /user/edureka/cust
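
If you want the block size used by a particular file rather than its total size, the stat command can print it directly; for example, using the same example path:

hadoop fs -stat %o /user/edureka/cust

which prints the file's block size in bytes (e.g. 134217728 on a cluster with the default 128 MB block size).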

Upvotes: 0
