Reputation: 389
In production, we use the bulk load API to load data into HBase tables by passing it two arguments (pathToHfile, targetTableName).
pathToHfile ---> Location of the HFiles in Hadoop
targetTableName ---> The target table that we want to load
When we use the bulk load API, the writes do not go through the WAL. But WAL files are what HBase uses to recover data. So how is the data recovered in this case, given that it is never written to the WAL?
Upvotes: 0
Views: 31
Reputation: 3990
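The WAL is used to recover changes that had not yet been written to HFiles, i.e. edits that were still in a MemStore when the region server crashed. With bulk loading you create the HFiles yourself and hand them over to HBase, so the data already sits in files on the Hadoop file system and is never held only in memory. The actual loading of the new files into HBase is atomic, so no recovery mechanism is needed here.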
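To make the distinction concrete, here is a toy model in plain Java. It is not the HBase API: `ToyRegion`, `put`, `bulkLoad` and `crashAndRecover` are hypothetical names made up for this illustration, and the real write path is considerably more involved. It only shows that a crash wipes the in-memory store (which the WAL then rebuilds), while bulk-loaded files are already durable and need no replay.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only; none of these classes or methods exist in HBase.
public class BulkLoadRecoverySketch {

    static class ToyRegion {
        // Durable: survives a crash (stands in for the WAL on disk).
        final List<String[]> wal = new ArrayList<>();
        // Volatile: lost on a crash (stands in for the MemStore).
        final Map<String, String> memStore = new TreeMap<>();
        // Durable: survives a crash (stands in for HFiles on disk).
        final List<Map<String, String>> hfiles = new ArrayList<>();

        // Normal write path: log to the WAL first, then update memory.
        void put(String row, String value) {
            wal.add(new String[] {row, value});
            memStore.put(row, value);
        }

        // Bulk load path: adopt a finished file, bypassing WAL and MemStore.
        void bulkLoad(Map<String, String> hfile) {
            hfiles.add(new TreeMap<>(hfile));
        }

        // Crash loses the memory contents; recovery replays the WAL.
        void crashAndRecover() {
            memStore.clear();
            for (String[] edit : wal) {
                memStore.put(edit[0], edit[1]);
            }
        }

        // Read from memory first, then from the newest file to the oldest.
        String get(String row) {
            if (memStore.containsKey(row)) {
                return memStore.get(row);
            }
            for (int i = hfiles.size() - 1; i >= 0; i--) {
                String value = hfiles.get(i).get(row);
                if (value != null) {
                    return value;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        ToyRegion region = new ToyRegion();
        region.put("row1", "v1");               // goes through the WAL

        Map<String, String> hfile = new TreeMap<>();
        hfile.put("row2", "v2");
        region.bulkLoad(hfile);                 // never touches the WAL

        region.crashAndRecover();
        System.out.println(region.get("row1")); // recovered by WAL replay
        System.out.println(region.get("row2")); // still there, file was durable
        System.out.println(region.wal.size());  // only the put() was logged
    }
}
```

In this sketch both rows are readable after the simulated crash even though only `row1` was ever logged, which is the same reason bulk-loaded data does not depend on the WAL.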
Upvotes: 1