Meg

Reputation: 131

File blocks on HDFS

Does Hadoop guarantee that different blocks from the same file will be stored on different machines in the cluster? Obviously, replicated blocks will be on different machines.

Upvotes: 1

Views: 547

Answers (4)

Thomas Jungblut

Reputation: 20969

Well, Hadoop does not guarantee that. Such a guarantee would actually be a big loss of safety: if a job requests a file and a datanode holding one of its blocks is down, the complete job fails just because that single block is unavailable. I can't imagine a use case for your question; maybe you can tell us a bit more so we can understand what your intention really is.
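
If you want to check where the blocks of a file actually ended up, you can ask the NameNode through the FileSystem API. A minimal sketch, assuming the Hadoop config on your classpath points at the cluster; the path /user/meg/part-1 is just a placeholder:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PrintBlockLocations {
        public static void main(String[] args) throws Exception {
            // Picks up fs.defaultFS from the Hadoop config on the classpath.
            FileSystem fs = FileSystem.get(new Configuration());

            // Placeholder path -- point this at a real file on your cluster.
            Path file = new Path("/user/meg/part-1");
            FileStatus status = fs.getFileStatus(file);

            // One BlockLocation per block, listing the datanodes that
            // hold a replica of that block.
            BlockLocation[] blocks =
                    fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                System.out.printf("offset=%d length=%d hosts=%s%n",
                        block.getOffset(), block.getLength(),
                        String.join(", ", block.getHosts()));
            }
        }
    }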

Upvotes: 0

Girish Rao

Reputation: 2669

Quite the contrary, I think. Setting aside replication, each datanode stores each block of data as its own file in its local file system.
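
If you log onto a datanode you can see this for yourself: each block lives under the directory configured by dfs.datanode.data.dir as a plain local file named blk_<id>. A small sketch that lists them; /hadoop/dfs/data is a placeholder for whatever your hdfs-site.xml says:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    public class ListBlockFiles {
        public static void main(String[] args) throws IOException {
            // Placeholder: use the value of dfs.datanode.data.dir from
            // the hdfs-site.xml on the datanode you are logged into.
            Path dataDir = Paths.get("/hadoop/dfs/data");

            // Each HDFS block (and its checksum .meta companion) is an
            // ordinary local file whose name starts with "blk_".
            try (Stream<Path> files = Files.walk(dataDir)) {
                files.filter(p -> p.getFileName().toString().startsWith("blk_"))
                     .forEach(System.out::println);
            }
        }
    }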

Upvotes: 0

CanSpice

Reputation: 35828

No. If you look at the HDFS Architecture Guide, you'll see (in the diagram) that file part-1 has a replication factor of 3, and is made up of three blocks labelled 2, 4, and 5. Note how blocks 2 and 5 are on the same Datanode in one case.
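
You can confirm the same behaviour on your own cluster by checking whether any host holds more than one block of a given file. A rough sketch using the same FileSystem API as above; the class name and argument handling are just for illustration:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    import java.util.HashMap;
    import java.util.Map;

    public class CoLocatedBlocks {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path file = new Path(args[0]); // HDFS path passed on the command line
            FileStatus status = fs.getFileStatus(file);
            BlockLocation[] blocks =
                    fs.getFileBlockLocations(status, 0, status.getLen());

            // Count how many distinct blocks of this file each host stores.
            // A count above 1 means two different blocks of the same file
            // ended up on the same machine.
            Map<String, Integer> blocksPerHost = new HashMap<>();
            for (BlockLocation block : blocks) {
                for (String host : block.getHosts()) {
                    blocksPerHost.merge(host, 1, Integer::sum);
                }
            }
            blocksPerHost.forEach((host, n) -> {
                if (n > 1) {
                    System.out.println(host + " holds " + n + " blocks of " + file);
                }
            });
        }
    }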

Upvotes: 1
