Reputation: 2709
My namenode server was hacked this weekend and the /usr/local/hadoop directory no longer exists. Is it still possible to recover a file that is stored on HDFS? The datanodes are accessible, and each contains blk_{...} data files somewhere in its directory hierarchy.
Upvotes: 0
Views: 1293
Reputation: 654
If you don't have any copy or backup of the name directory, recovering the data will be quite difficult. The datanodes have no concept of a file, only of blocks. All of the data exists in those blocks, but you would have to reconstruct files from their blocks manually. If a few specific files are of very high importance and the overall data volume is small, you may be able to sift through the blocks to find what you're looking for, but I'm not aware of any better approach.
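As a starting point for that sifting, here is a minimal sketch that walks a datanode data directory and searches the raw blk_* files for a known byte signature from a file you care about. The root path and the needle are assumptions you must supply (the data directory is whatever your dfs.datanode.data.dir pointed at); this only helps for uncompressed, plain-text-ish content, and reassembling a multi-block file into the right order is still manual work.

```python
import os

def find_blocks_containing(root, needle, chunk_size=64 * 1024):
    """Scan every blk_* file under `root` for the byte string `needle`.

    `root` is a datanode data directory (hypothetical path; adjust for
    your layout). Returns the paths of block files containing `needle`.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Block files are named blk_<id>; skip the blk_*.meta
            # checksum files that sit next to them.
            if not name.startswith("blk_") or name.endswith(".meta"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                prev_tail = b""
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    # Keep a tail so a needle spanning two chunks still matches.
                    if needle in prev_tail + chunk:
                        matches.append(path)
                        break
                    prev_tail = chunk[-(len(needle) - 1):] if len(needle) > 1 else b""
    return sorted(matches)

# Example (hypothetical paths):
# hits = find_blocks_containing("/data/1/dfs/dn", b"account_id,amount")
```

Run it once per datanode (or per configured data directory) and diff the results; with the default replication factor of 3 the same block content should turn up on several nodes.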
This is why there are several ways to redundantly store copies of the namespace: for example, by specifying multiple directories in the dfs.namenode.name.dir property, or by running either a Secondary or a Standby NameNode (see https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Secondary_NameNode), which keeps a copy of the namespace on a separate machine.
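For the multiple-directories option, the hdfs-site.xml entry looks like the sketch below (the paths are hypothetical; the second one would typically be an NFS mount on another machine). The NameNode writes the fsimage and edit logs to every listed directory, so losing one host no longer loses the namespace.

```xml
<!-- hdfs-site.xml: comma-separated list; paths are example values -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/1/dfs/nn,file:///nfs/backup/dfs/nn</value>
</property>
```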
Upvotes: 2