Inder Singh

Reputation: 319

Differences between the hflush and hsync APIs in HDFS

Can someone explain the technical details and when to use which?

Upvotes: 8

Views: 4716

Answers (1)

zsxwing

Reputation: 20826

In the current HDFS implementation (0.23.3), hflush and hsync behave the same: hsync simply invokes hflush. hflush guarantees that the flushed data becomes visible to new readers, but it does not guarantee that the data has been flushed to persistent storage on the datanodes. Using hflush may therefore lose some data if datanodes fail. hsync is designed to guarantee that all data has been written to the disk device, but that behavior is not implemented yet.

In the HDFS 2.0.* alpha releases, hsync is implemented correctly.

You can find more details in "HBase, HDFS and durable sync".
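To make the visibility versus durability distinction concrete, here is a rough local-filesystem analogy using only the JDK. It is not HDFS code: `flush()` on a buffered stream stands in for hflush (data leaves the writer's buffer and new readers can see it), and `FileDescriptor.sync()` stands in for hsync (the OS is asked to force the data to the storage device). The class name `FlushVsSync` and the helper `writeRecord` are made up for this sketch. In HDFS itself the corresponding calls are `hflush()` and `hsync()` on `FSDataOutputStream`.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration class; a local-file analogy, not the HDFS API.
final class FlushVsSync {

    // Appends a record, flushes it, optionally syncs it to the device,
    // and returns the file content a new reader sees at that point.
    static String writeRecord(Path file, String record, boolean durable) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(file.toFile(), true);
             BufferedOutputStream out = new BufferedOutputStream(fos)) {
            out.write(record.getBytes(StandardCharsets.UTF_8));

            // Analogous to hflush: push buffered bytes out of the writer so
            // that readers opening the file now can see them. The data may
            // still sit in OS caches and be lost on a crash or power failure.
            out.flush();

            if (durable) {
                // Analogous to hsync: additionally ask the OS to force the
                // written data to the underlying storage device.
                fos.getFD().sync();
            }

            // Read through a separate handle, as a new reader would.
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("flush-vs-sync", ".log");
        try {
            System.out.print(writeRecord(file, "visible after flush\n", false));
            System.out.print(writeRecord(file, "durable after sync\n", true));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

In both calls the reader sees the new record immediately. The difference only shows up after a crash, which is why the sync variant costs more and is usually reserved for data that must survive machine failure, such as a write-ahead log.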

Upvotes: 9
