Reputation: 1852
I want to start a hadoop streaming job, but it fails complaining:
15/05/19 23:17:34 ERROR streaming.StreamJob: Error Launching job : The NameSpace quota (directories and files) of directory /user/myname is exceeded: quota=1000000 file count=1000001
I tried deleting some files using hdfs dfs -rm -r -f files,
which reported that the files were moved to trash. I then tried hdfs dfs -expunge
and got back:
15/05/19 23:12:32 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
15/05/19 23:12:33 INFO fs.TrashPolicyDefault: Created trash checkpoint: /user/myname/.Trash/150519231233
But I still get the original error. What should I do?
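In case it helps, this is how I check how close I am to the quota; hdfs dfs -count -q prints the namespace quota and the remaining quota for a directory (on my version the output columns are QUOTA, REM_QUOTA, SPACE_QUOTA, REM_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME):

hdfs dfs -count -q /user/myname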
Upvotes: 3
Views: 12182
Reputation: 73366
If I were you, I would follow the other answer...
If you really know what you are doing, then you could do:
[gsamaras@gwta3000 ~]$ hadoop fs -rm -f -r -skipTrash /path/to/dirToBeDeleted
Deleted /path/to/dirToBeDeleted
which I assembled after reading How to delete a non-empty directory in Terminal? and a few related posts.
When you delete a file or a directory, it goes into the Trash. When you then empty the Trash, a configurable interval (it depends on your setup; mine is 1 hour) has to pass before the actual deletion occurs.
The idea is that you might delete something important (or something that took a lot of computing power to produce) by accident, and that interval gives you the opportunity to recover your data.
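If you want to see what the interval is on your cluster, you can ask for the effective configuration; fs.trash.interval is the standard property, its value is in minutes, and 0 means client-side trash is disabled (a quick sketch, your cluster may override the value elsewhere):

# print the effective trash interval, in minutes
hdfs getconf -confKey fs.trash.interval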
If you are not sure how to proceed, then I would advise waiting for at least an hour and then trying again; otherwise..
From that link, there is this list that proposes some ideas:

- Pass -Dfs.trash.interval=0 when deleting a large directory.
- Exclude /user/<username>/.Trash from the quota.
- Move Trash out of the /user directory. Maybe use /Trash/<username> and set a different quota on it.
- When -rm/-rmr fail with a quota error, automatically delete the files.
- Use -rmr -skipTrash for a force delete.
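For example, the first and last ideas boil down to a single client-side command, while the quota idea needs HDFS admin rights; in the sketch below the quota value is arbitrary and /Trash/myname is just the placeholder location from the list above:

# delete directly, bypassing the Trash, by disabling the trash interval for this one command
hadoop fs -Dfs.trash.interval=0 -rm -r /path/to/dirToBeDeleted

# and, if Trash were relocated to its own tree as suggested above, an admin could give it its own quota
hdfs dfsadmin -setQuota 1000000 /Trash/myname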
Upvotes: 5
Reputation: 1852
It turns out I only needed to wait a few hours until everything settled down!
Upvotes: 1