Reputation: 1216
When running a single-node HDFS cluster (or pseudo-distributed mode) with several data directories on separate physical hard disk drives, is it possible to have block replication in case of a disk failure?
I understand that single-node installations are atypical, but I would still like to know. Everything I read dealt only with node failures, and I could find nothing about disk failures in single-node scenarios.
Note: I'm only interested in the possibility of data loss here, not in the availability of the so-called "cluster".
Upvotes: 3
Views: 530
Reputation: 1635
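A disk failure alone can cause a node failure, so with a single node and a single disk the data is lost when that disk dies. HDFS places the replicas of a block on different DataNodes, not on different data directories of the same DataNode, so several directories under one DataNode do not give you replication. If the node has two disks, however, you can run two DataNodes on that machine, each with its own disk, and then you can have replication. In that setup a disk failure does not necessarily take down the whole node: only the DataNode using that disk fails, and the other one still holds a copy of each block.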
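As a rough sketch of what the two-DataNode setup might look like, the Python snippet below generates per-instance `hdfs-site.xml` overrides so that each DataNode gets its own disk and its own ports. The property names (`dfs.datanode.data.dir`, `dfs.datanode.address`, `dfs.datanode.http.address`, `dfs.datanode.ipc.address`) are standard HDFS settings, but the mount paths, the port offsets, and the helper names are assumptions made for illustration. The base ports assume Hadoop 3.x defaults. You would also need `dfs.replication` set to 2 and a separate configuration, log, and PID directory for each instance.

```python
from xml.sax.saxutils import escape


def datanode_properties(index, data_dir, base_port=9866):
    """Build hdfs-site.xml overrides for one DataNode instance.

    Hypothetical helper: each instance gets its own data directory and a
    port block shifted by 10 so two DataNodes can share one host.
    base_port assumes the Hadoop 3.x default for dfs.datanode.address.
    """
    offset = index * 10
    return {
        "dfs.datanode.data.dir": data_dir,
        "dfs.datanode.address": f"0.0.0.0:{base_port + offset}",
        "dfs.datanode.http.address": f"0.0.0.0:{base_port + offset - 2}",
        "dfs.datanode.ipc.address": f"0.0.0.0:{base_port + offset + 1}",
    }


def to_hadoop_xml(properties):
    """Render a dict of properties in Hadoop's <configuration> format."""
    lines = ["<configuration>"]
    for name, value in properties.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


# Assumed mount points, one per physical disk (not from the original post).
disks = ["/mnt/disk1/hdfs/data", "/mnt/disk2/hdfs/data"]
configs = [datanode_properties(i, d) for i, d in enumerate(disks)]

if __name__ == "__main__":
    for i, props in enumerate(configs):
        print(f"--- overrides for DataNode {i} ---")
        print(to_hadoop_xml(props))
```

Each generated block would go into the `hdfs-site.xml` of a separate configuration directory, and each DataNode would be started against its own directory. Both then register with the same NameNode as distinct DataNodes, which is what lets HDFS put one replica on each disk.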
Upvotes: 2