Reputation: 319
Can someone highlight the technical details of hflush and hsync, and when to use which?
Upvotes: 8
Views: 4716
Reputation: 20826
In the current HDFS implementation (0.23.3), hflush and hsync are the same: hsync simply invokes hflush. hflush guarantees that the flushed data becomes visible to new readers. It does not guarantee that the data has been flushed to persistent storage on the datanode, so with hflush you may lose some data if datanodes fail. hsync is designed to guarantee that all data is written to the disk device, but this is not implemented yet.
In the HDFS 2.0.* alpha releases, hsync is implemented correctly.
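To make the visibility vs. durability distinction concrete, here is a rough local-file analogy in plain Java. It is not the HDFS API: on HDFS you would call hflush() or hsync() on the FSDataOutputStream, whereas this sketch uses BufferedOutputStream.flush() as a stand-in for hflush and FileDescriptor.sync() as a stand-in for hsync. The class and method names (FlushVsSync, writeVisible, writeDurable, demo) are made up for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Local-file analogy only (assumption): this does not talk to HDFS.
class FlushVsSync {

    // Roughly like hflush: push buffered bytes out of the writer so that
    // other readers can see them. The data may still sit in the OS cache.
    static void writeVisible(BufferedOutputStream out, String record) throws IOException {
        out.write(record.getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    // Roughly like hsync: make the data visible, then also ask the OS to
    // force it to the storage device. More expensive per call.
    static void writeDurable(FileOutputStream raw, BufferedOutputStream out, String record)
            throws IOException {
        writeVisible(out, record);
        raw.getFD().sync();
    }

    // Writes two records and returns how many lines a second reader sees
    // while the writer is still open.
    static int demo() {
        try {
            Path file = Files.createTempFile("flush-vs-sync", ".log");
            try (FileOutputStream raw = new FileOutputStream(file.toFile());
                 BufferedOutputStream out = new BufferedOutputStream(raw)) {
                writeVisible(out, "cheap, frequent record\n");
                writeDurable(raw, out, "record that must survive a crash\n");
                return Files.readAllLines(file).size();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```

As a rule of thumb under this analogy: use the flush-style call when new readers only need to see the data and some loss on failure is acceptable, and use the sync-style call for records that must survive a crash, accepting the extra latency.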
You can find more details in the article "HBase, HDFS and durable sync".
Upvotes: 9