Patrick

Reputation: 2709

Restore file from HDFS after namenode delete

My namenode server was hacked this weekend, and the /usr/local/hadoop directory no longer exists. Is it still possible to recover a file that was stored on HDFS? The datanodes are accessible, and each one contains blk_{...} data somewhere in its directory hierarchy.

Upvotes: 0

Views: 1293

Answers (1)

xkrogen

Reputation: 654

If you don't have any copy or backup of the name dir, recovering the data will be quite difficult. The datanodes have no concept of a file, only of blocks. All of the data exists in those blocks, but you would have to manually reconstruct files from them. If you have a few specific files of very high importance and not much data overall, you may be able to sift through the blocks to find what you're looking for, but I'm not aware of anything better than that.
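To make "sifting through the blocks" a bit more concrete, here is a minimal sketch of a script you could run on each datanode. It assumes the block data files are named blk_<id> (with separate checksum files ending in .meta) and that the file you want contains a byte sequence you can recognize, such as a CSV header. The function names and the head_bytes limit are made up for illustration.

```python
import os
import re

# Block data files are named blk_<id>; checksum files (blk_<id>_<stamp>.meta)
# do not match this pattern and are skipped.
BLOCK_NAME = re.compile(r"^blk_-?\d+$")

def find_block_files(data_dir):
    """Yield (path, size) for every block data file under data_dir."""
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if BLOCK_NAME.match(name):
                path = os.path.join(root, name)
                yield path, os.path.getsize(path)

def blocks_containing(data_dir, needle, head_bytes=1 << 20):
    """Return sorted (path, size) pairs whose first head_bytes contain needle."""
    hits = []
    for path, size in find_block_files(data_dir):
        with open(path, "rb") as f:
            if needle in f.read(head_bytes):
                hits.append((path, size))
    return sorted(hits)
```

You would call it with the datanode's data directory and a signature, for example blocks_containing("/path/to/datanode/data", b"id,name"), where both arguments are placeholders. Keep in mind that this only finds the block holding the signature: a file larger than one block is spread over several blocks, possibly on different datanodes, and their order still has to be worked out by hand. It also won't match compressed or binary-encoded files unless you search for their magic bytes instead.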

This is why there are a number of ways to store redundant copies of the namespace, e.g. by specifying multiple directories in the dfs.namenode.name.dir property, and by using either a Secondary or a Standby Namenode (see https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Secondary_NameNode), which acts as a remote location storing a copy of the namespace.
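As a rough way to check whether a cluster is set up like this, the sketch below reads an hdfs-site.xml and reports how many directories dfs.namenode.name.dir lists (the value is a comma-separated list). The sample configuration and its two paths are invented for the example; the second one stands in for something like an NFS mount on another machine.

```python
import xml.etree.ElementTree as ET

def name_dirs(hdfs_site_xml):
    """Return the directories listed in dfs.namenode.name.dir, or [] if unset."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "dfs.namenode.name.dir":
            value = prop.findtext("value", "")
            return [d.strip() for d in value.split(",") if d.strip()]
    return []

# Hypothetical configuration: both paths are placeholders.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data/1/dfs/nn,file:///mnt/nfs/dfs/nn</value>
  </property>
</configuration>"""

dirs = name_dirs(SAMPLE)
if len(dirs) < 2:
    print("warning: only one copy of the namespace is kept")
else:
    print("namespace is written to:", dirs)
```

With more than one directory listed, the namenode writes the namespace to each of them, so losing a single directory (or a single disk) does not lose the file-to-block mapping.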

Upvotes: 2
