How to recover deleted files in Hadoop HDFS

While working with files in Hadoop HDFS, you may accidentally delete a file or a folder.

If you made this mistake, do not worry. You may still be able to recover your file from Trash.

In Hadoop, the trash folder is located at the path below:

/user/<your username>/.Trash

When you delete a file or a folder, it is moved into the Current subdirectory of this folder, under its full original path. You can navigate to that path inside the Trash folder and retrieve the file by moving or copying it back to its original location.
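
As a rough sketch of the restore step, the shell functions below build the trash location for a deleted path and move the file back. They assume the standard /user/<username> home directory layout, that trash is enabled, and that the file has not been purged yet. The function names, the user alice, and the example path are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Sketch only: assumes trash is enabled and the file is still in trash.

# Build the location a deleted path would have inside a user's trash.
# $1 = HDFS username, $2 = absolute original path of the deleted file.
trash_path() {
    printf '/user/%s/.Trash/Current%s\n' "$1" "$2"
}

# Move a deleted file from trash back to its original location.
# Requires the hdfs client on PATH and a reachable cluster.
restore_from_trash() {
    user="$1"
    original="$2"
    src="$(trash_path "$user" "$original")"
    # Confirm the file is still in trash before trying to move it.
    hdfs dfs -test -e "$src" || { echo "not in trash: $src" >&2; return 1; }
    hdfs dfs -mv "$src" "$original"
}

# Example (hypothetical user and path):
#   restore_from_trash alice /data/reports/sales.csv
```

Note that the move only succeeds if the parent directory of the original path still exists. If the whole folder was deleted, restore the folder itself rather than a single file inside it.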

Remember that this works only if you deleted the file or folder using the usual -rm command (for example, hdfs dfs -rm).

If you used the -skipTrash option when deleting, the file or folder bypasses Trash and is deleted permanently, so you will not be able to recover it from the HDFS trash folder.
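
To make the difference concrete, here is a small sketch that wraps the two forms of delete. The wrapper names are made up for this example; the underlying hdfs dfs -rm options (-r and -skipTrash) are the standard ones.

```shell
#!/bin/sh
# Hypothetical wrappers contrasting the two kinds of delete.

# Recoverable delete: the path is moved to the user's trash
# (only if trash is enabled on the cluster).
delete_recoverable() {
    hdfs dfs -rm -r "$1"
}

# Permanent delete: -skipTrash bypasses trash, so nothing can be restored.
delete_permanent() {
    hdfs dfs -rm -r -skipTrash "$1"
}

# Example (hypothetical path):
#   delete_recoverable /data/reports/old
```

Using the recoverable form by default, and reserving -skipTrash for cases where you are sure, keeps the recovery option open.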

Also, remember that files moved to Trash are automatically deleted after a certain period, say 30 minutes, or whatever your Hadoop Administrator has configured for your environment. This period is set, in minutes, by the fs.trash.interval property in the core-site.xml file.

In stock Apache Hadoop, the trash interval (the fs.trash.interval property) defaults to zero, which disables Trash altogether: deleted files are removed immediately and cannot be recovered this way. Some distributions and administrators set a non-zero value, so check what your cluster actually uses before relying on Trash.
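
If you want to check the setting yourself, the sketch below reads fs.trash.interval with hdfs getconf and prints what the value means. The function names are hypothetical. Keep in mind that this reads the configuration visible to your client, which may differ from what is set on the NameNode, so treat the result as a hint and confirm with your administrator.

```shell
#!/bin/sh
# Sketch only: reads the client-side view of the trash setting.

# Print the configured trash retention in minutes.
trash_interval_minutes() {
    hdfs getconf -confKey fs.trash.interval
}

# Explain a retention value. $1 = minutes (0 means trash is disabled).
describe_trash_interval() {
    minutes="$1"
    if [ "$minutes" -eq 0 ]; then
        echo "trash disabled: deletes are permanent"
    else
        echo "trash enabled: files kept for about $minutes minutes"
    fi
}

# Example:
#   describe_trash_interval "$(trash_interval_minutes)"
```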

Hope this post helps you recover some of your important files.
