Lavanya varma

Reputation: 75

How to restore a deleted file in HDFS

I was asked the question below in an interview.

Interviewer: How do you recover a deleted file in HDFS?

Me: From the trash directory; we can copy or move it back to the original directory.

Interviewer: Is there any other way besides recovering from trash?

Me: No.

So my question is: is there really another way to recover deleted files, or was the interviewer just testing my confidence?

I have found the approach below, which differs from hdfs dfs -cp/-mv, but it also retrieves the file from trash:

hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true -D dfs.checksum.type=CRC32C -m 10 -pb -update /users/vijay/.Trash/ /application/data/vijay;
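
For reference, the plain trash recovery mentioned above can be sketched as follows. This is only a minimal sketch: it assumes the default trash layout (/user/<user>/.Trash/Current/<original absolute path>), which may differ on a given cluster, and the helper names trash_path and restore_from_trash are hypothetical, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Sketch: restore a single file from the HDFS trash with a plain move.
# Assumption: default trash layout /user/<user>/.Trash/Current/<original path>.

# trash_path USER ORIGINAL_PATH  (hypothetical helper)
# Prints where a deleted file is expected to sit in the user's trash.
trash_path() {
  local user="$1" original="$2"
  printf '/user/%s/.Trash/Current%s\n' "$user" "$original"
}

# restore_from_trash USER ORIGINAL_PATH  (hypothetical helper)
# Moves the file back if it is still in trash; fails otherwise.
restore_from_trash() {
  local user="$1" original="$2"
  local src
  src="$(trash_path "$user" "$original")"
  # "hdfs dfs -test -e" exits with 0 when the path exists.
  if hdfs dfs -test -e "$src"; then
    hdfs dfs -mv "$src" "$original"
  else
    echo "Not found in trash (expunged, or deleted with -skipTrash): $src" >&2
    return 1
  fi
}
```

It would be called as, for example, restore_from_trash vijay /application/data/vijay/somefile, where the file name is made up for illustration. The move only works while the trash checkpoint has not yet been expunged.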

Upvotes: 0

Views: 1687

Answers (2)

Data Wizard

Reputation: 1

During the interview, you were asked about recovering deleted files in HDFS. Your answer about retrieving data from the trash directory was correct.

However, it's important to note that there are other methods to recover files in HDFS, such as:

- using snapshots (if enabled)
- replication factor
- professional data recovery software

Your discovery of a command that retrieves files from the trash shows a useful technique that's frequently used in HDFS systems.

Although your response was accurate, showing flexibility and problem-solving abilities by being open to considering different options and sharing your ideas throughout the interview would have been a plus.

Upvotes: 0

ZeroGuilty

Reputation: 66

Hadoop has provided an HDFS snapshot feature since version 2.1.0. You can try using it.

First, create a snapshot:

hdfs dfsadmin -allowSnapshot /user/hdfs/important
hdfs dfs -createSnapshot /user/hdfs/important important-snapshot

Next, delete a file:

hdfs dfs -rm -r /user/hdfs/important/important-file.txt

Finally, restore it:

hdfs dfs -ls /user/hdfs/important/.snapshot/
hdfs dfs -cp /user/hdfs/important/.snapshot/important-snapshot/important-file.txt /user/hdfs/important/
hdfs dfs -cat /user/hdfs/important/important-file.txt
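
The restore step above can also be wrapped in a small helper. This is only a sketch under the same assumptions as the commands above (the directory is snapshottable and the snapshot was taken before the delete); the function names snapshot_path and restore_from_snapshot are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: copy a deleted file back out of a read-only HDFS snapshot.

# snapshot_path DIR SNAPSHOT RELATIVE_FILE  (hypothetical helper)
# Prints the path of the file inside the snapshot; a trailing "/" on DIR is dropped.
snapshot_path() {
  printf '%s/.snapshot/%s/%s\n' "${1%/}" "$2" "$3"
}

# restore_from_snapshot DIR SNAPSHOT RELATIVE_FILE  (hypothetical helper)
# Copies (not moves) the file back, because snapshot contents are read-only.
restore_from_snapshot() {
  local dir="${1%/}" snapshot="$2" file="$3"
  local src
  src="$(snapshot_path "$dir" "$snapshot" "$file")"
  # -p asks cp to preserve timestamps, ownership and permissions.
  hdfs dfs -cp -p "$src" "$dir/$file"
}
```

With the names used in this answer, the call would be restore_from_snapshot /user/hdfs/important important-snapshot important-file.txt.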

P.S. You have to use the cp command (not mv) to recover a deleted file this way, because files inside a snapshot are read-only.

I hope this answer helps.

Upvotes: 4
