rakesh kumar

Reputation: 61

What is the Hadoop block abstraction? Need more details on this

I was going through Hadoop: The Definitive Guide and was not clear on the concepts below.

  1. Block abstraction: can someone elaborate on this?

  2. "Making the unit of abstraction a block rather than a file simplifies the storage subsystem."

    a.) What does it mean for the unit of abstraction to be a block?

    b.) How is a block made the unit of abstraction?

    c.) How does this simplify the storage subsystem?

Upvotes: 1

Views: 2908

Answers (2)

Lester Martin

Reputation: 331

HDFS is in some ways just another filesystem and, like all the others, it breaks files into blocks. The key differences are that the blocks are big (e.g. 128 MB) rather than small (e.g. 4 KB), and that each block is replicated on different servers in the HDFS cluster.

Most of us don't work directly with blocks; we work with files. One could argue that this "block abstraction" really serves two purposes.

  • First, it lets the storage subsystem (HDFS) scale to a massive level by continuing to add servers.
  • Second, it lets the frameworks (like MapReduce, Tez, HBase, Spark, etc.) align their tactical work to these blocks when processing the logical full file.
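To make the two points above concrete, here is a minimal sketch of the idea, not HDFS code. It splits a file of a given size into fixed-size block ranges and assigns each block to several servers. The function names, the node names, and the round-robin placement are all assumptions made for illustration; the real HDFS placement policy is rack-aware and considerably more involved.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the example block size used above


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs that together cover file_size bytes."""
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block only covers whatever data is left in the file.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def place_replicas(num_blocks, nodes, replication=3):
    """Toy round-robin placement (hypothetical, NOT the real HDFS policy)."""
    placement = {}
    for i in range(num_blocks):
        placement[i] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement


blocks = split_into_blocks(300 * 1024 * 1024)  # a 300 MB file
placement = place_replicas(len(blocks), ["node1", "node2", "node3", "node4"])
for index, (offset, length) in enumerate(blocks):
    print(index, offset, length, placement[index])
```

Each (offset, length) pair is also a natural unit of work for a processing framework: one task can read one block, ideally on a server that already holds a replica of it.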

Upvotes: 1

Ani Menon

Reputation: 28199

HDFS Block abstraction:

The HDFS block size is usually 64 MB to 128 MB (64 MB was the default in Hadoop 1.x, 128 MB in 2.x and later). Unlike a filesystem for a single disk, a file smaller than the block size does not occupy a full block's worth of underlying storage.

The block size is kept this large so that the time spent on disk seeks is small compared to the time spent transferring the data.
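A rough back-of-the-envelope calculation shows why. The sketch below assumes a 10 ms seek time and a 100 MB/s transfer rate; both figures are illustrative assumptions, not measurements, and `seek_overhead` is a hypothetical helper name.

```python
def seek_overhead(block_size_bytes, seek_time_s=0.010, transfer_rate_bps=100 * 10**6):
    """Fraction of the total read time spent seeking, for one block.

    seek_time_s and transfer_rate_bps are assumed, illustrative values.
    """
    transfer_time_s = block_size_bytes / transfer_rate_bps
    return seek_time_s / (seek_time_s + transfer_time_s)


small = seek_overhead(4 * 1024)            # a 4 KB block
large = seek_overhead(128 * 1024 * 1024)   # a 128 MB block

print(f"4 KB block:   {small:.1%} of the time is seeking")
print(f"128 MB block: {large:.1%} of the time is seeking")
```

Under these assumptions, reading a 4 KB block is almost entirely seek time (over 99%), while for a 128 MB block the seek is under 1% of the total, so the disk spends nearly all of its time transferring data.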

Why block abstraction:

  • A file can be larger than any single disk in the cluster, because its blocks can be spread across many disks.
  • File metadata (such as permissions) does not need to be stored with each block. Blocks are just chunks of data, and the metadata is handled separately by the NameNode.
  • Simplifies storage management: because blocks are a fixed size, it is easy to calculate how many can be stored on each disk.
  • Fault tolerance and storage replication can easily be done on a per-block basis (storage/HA policies can be run on individual blocks).
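The last two points can be sketched in a few lines. This is only a toy model under stated assumptions: the block IDs, node names, and both helper functions are hypothetical, and the capacity figure ignores filesystem overhead and partially filled blocks.

```python
def blocks_per_disk(disk_bytes, block_size_bytes):
    """How many full-size blocks fit on a disk (ignores overhead)."""
    return disk_bytes // block_size_bytes


def missing_replicas(block_locations, live_nodes, target=3):
    """Map each under-replicated block ID to the number of replicas it lacks."""
    result = {}
    for block_id, nodes in block_locations.items():
        live = [n for n in nodes if n in live_nodes]
        if len(live) < target:
            result[block_id] = target - len(live)
    return result


capacity = blocks_per_disk(1024**4, 128 * 1024**2)  # 1 TiB disk, 128 MiB blocks

# Hypothetical block-to-node map; node4 has failed and is not in live_nodes.
block_locations = {
    "blk_1": ["node1", "node2", "node3"],
    "blk_2": ["node2", "node3", "node4"],
}
to_repair = missing_replicas(block_locations, live_nodes={"node1", "node2", "node3"})

print(capacity)
print(to_repair)
```

Note that the repair decision looks only at blocks and nodes. It never needs to know which file a block belongs to, which is the sense in which working at the block level keeps the storage layer simple.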

Upvotes: 2
