Reputation: 61
I was going through Hadoop: The Definitive Guide and was not clear on the concept below.
Block abstraction: can someone elaborate on this? The book says:
Making the unit of abstraction a block rather than a file simplifies the storage subsystem.
a.) What does it mean for a block to be the "unit of abstraction"?
b.) How is a block made the unit of abstraction?
c.) How does this simplify the storage subsystem?
Upvotes: 1
Views: 2908
Reputation: 331
In some ways HDFS is just another filesystem and, like all the others, it breaks files into blocks. The key differences are that the blocks are big (e.g. 128 MB) instead of small (e.g. 4 KB), and that each block is replicated on different servers in the larger HDFS architecture.
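To make the splitting concrete, here is a minimal sketch of how a file's size maps onto block sizes. It is only an illustration of the arithmetic, not HDFS code: `split_into_blocks` is a made-up helper name, and the 128 MB default is just the example figure from above.

```python
MB = 1024 * 1024

def split_into_blocks(file_size, block_size=128 * MB):
    """Return the sizes (in bytes) of the blocks a file would be split into.

    Hypothetical helper for illustration only; not part of any Hadoop API.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only holds the leftover bytes, not a full block.
        sizes.append(remainder)
    return sizes

# A 300 MB file becomes two full blocks plus one partial block.
print([size // MB for size in split_into_blocks(300 * MB)])  # [128, 128, 44]
```

Note that the last block only holds the remaining 44 MB; it is not padded out to 128 MB.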
Most of us don't work directly with blocks; we work with files. One could argue that this "block abstraction" really serves two purposes.
Upvotes: 1
Reputation: 28199
HDFS Block abstraction:
The HDFS block size is usually 64 MB to 128 MB and, unlike in other filesystems, a file smaller than the block size does not occupy a full block's worth of underlying storage.
The block size is kept this large so that the time spent on disk seeks is small compared with the time spent transferring the data.
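A rough back-of-the-envelope sketch of that trade-off is below. The 10 ms seek time and 100 MB/s transfer rate are assumed, illustrative figures (real disks vary), and `seek_to_transfer_ratio` is a made-up helper name.

```python
def seek_to_transfer_ratio(block_size_mb, seek_time_s=0.010, transfer_rate_mb_s=100.0):
    """Ratio of seek time to transfer time for reading one block.

    Defaults are assumed, illustrative values: 10 ms per seek, 100 MB/s transfer.
    """
    if block_size_mb <= 0 or transfer_rate_mb_s <= 0:
        raise ValueError("block size and transfer rate must be positive")
    transfer_time_s = block_size_mb / transfer_rate_mb_s
    return seek_time_s / transfer_time_s

small = seek_to_transfer_ratio(4 / 1024)  # 4 KB block
large = seek_to_transfer_ratio(128)       # 128 MB block

print(f"4 KB block:   seek time is about {small:.0f}x the transfer time")
print(f"128 MB block: seek time is about {large:.1%} of the transfer time")
```

Under these assumptions, a 4 KB block spends far more time seeking than transferring, while for a 128 MB block the seek costs under 1% of the transfer time.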
Why block abstraction:
Upvotes: 2