user19930511

Reputation: 389

How to recover data if the Bulk Load API is used in Java MapReduce?

In production, we use the Bulk Load API to load data into HBase tables by passing it two arguments (pathToHfile, targetTableName).

pathToHfile ---> Location of the HFiles in Hadoop
targetTableName ---> The target table that we want to load into

When we use the Bulk Load API, the writes do not go to the WAL file. But WAL files are what is used to recover the data. So how are we going to recover the data in this case, given that it is never written to the WAL file?

Upvotes: 0

Views: 31

Answers (1)

shay__

Reputation: 3990

The WAL is used to recover changes that had not yet been written to HFiles (i.e. MemStore contents lost in a crash). In bulk loading you create the HFiles yourself and hand them over to HBase. The actual loading of the new files into HBase is atomic, so no recovery mechanism is needed here.
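
To make the difference between the two write paths concrete, here is a toy model in plain Java. It is only an analogy under stated assumptions, not HBase code: `BulkLoadSketch`, `put`, `bulkLoad`, `crash` and `recover` are hypothetical names, a text file stands in for the WAL, an in-memory list stands in for the MemStore, and an atomic file move stands in for handing a finished HFile over to a region.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model only: none of these names are HBase APIs. */
public class BulkLoadSketch {
    private final Path storeDir;                              // stands in for durable HFiles
    private final Path walFile;                               // stands in for the write-ahead log
    private final List<String> memStore = new ArrayList<>();  // in memory, lost on a crash

    public BulkLoadSketch(Path root) throws IOException {
        this.storeDir = Files.createDirectories(root.resolve("store"));
        this.walFile = root.resolve("wal.log");
    }

    /** Normal write path: append to the log first, then buffer in memory. */
    public void put(String cell) throws IOException {
        Files.write(walFile, (cell + "\n").getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        memStore.add(cell);
    }

    /** Bulk path: the file is already complete on disk, so it is moved in atomically. */
    public void bulkLoad(Path preparedFile) throws IOException {
        Files.move(preparedFile, storeDir.resolve(preparedFile.getFileName()),
                StandardCopyOption.ATOMIC_MOVE);
    }

    /** Simulates a server crash: everything held only in memory is gone. */
    public void crash() {
        memStore.clear();
    }

    /** Recovery rebuilds the in-memory buffer by replaying the log. */
    public void recover() throws IOException {
        memStore.clear();
        if (Files.exists(walFile)) {
            memStore.addAll(Files.readAllLines(walFile, StandardCharsets.UTF_8));
        }
    }

    /** Returns every visible cell (memory plus store files), sorted. */
    public List<String> readAll() throws IOException {
        List<String> cells = new ArrayList<>(memStore);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(storeDir)) {
            for (Path file : files) {
                cells.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
            }
        }
        Collections.sort(cells);
        return cells;
    }
}
```

In this sketch, a cell written with `put` disappears after `crash()` until `recover()` replays the log, while a file handed over with `bulkLoad` is still readable right after the crash because it never lived only in memory. The atomic move either happens completely or not at all, so if a bulk load fails midway the source file is still there and the load can simply be retried.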

Upvotes: 1
