Reputation: 59586
I'm running a plain du
on a huge directory. It will probably take ages, since the storage is also network-attached.
I would like to see the progress before the process finishes, so that I can already estimate how far along it is. At any given time I'd like to see the sum of disk usage collected so far, as du
counts it. I found no option for du
to provide this. Did I miss something? Is there an easy way to achieve this?
I imagined something like this:
du -ba . | { s=0; while read a b; do ((s+=a)); echo $s; done; }
This would sum up the output, but of course it would also add the accumulated directory sizes (effectively counting each file several times). I found no option to list only the files in the output. On the other hand, using find -type f -printf "%s %p\n"
instead would count hardlinks multiple times.
Is there a typical tool to achieve what I want, or a simple fix for the script above? I'm currently considering writing a Python script for this, but I have the feeling that might be overkill.
Upvotes: 3
Views: 3222
Reputation: 13097
I think that in order to profit from the performance of the du
utility compared with any custom script, one could simply patch du itself and rebuild it:
tar -xf coreutils-8.30.tar.xz && cd coreutils-8.30
./configure --prefix=/custom/location/of/modified/coreutils
Edit ./src/du.c and, after line 666, add the statement print_size (&tot_dui, _("total"));
The end of the process_file
function would look like:
  if ((IS_DIR_TYPE (info) && level <= max_depth)
      || (opt_all && level <= max_depth)
      || level == 0)
    {
      /* Print or elide this entry according to the --threshold option. */
      uintmax_t v = opt_inodes ? dui_to_print.inodes : dui_to_print.size;
      if (opt_threshold < 0
          ? v <= -opt_threshold
          : v >= opt_threshold)
        print_size (&dui_to_print, file);
      print_size (&tot_dui, _("total")); /* extra statement */
    }

  return ok;
make install
This makes the modified du
report the running total after each printed entry, i.e., the output could look like:
129K ./bin/dirname
33M total
132K ./bin/uname
33M total
207K ./bin/sha1sum
33M total
156K ./bin/truncate
33M total
311K ./bin/pr
34M total
172K ./bin/printf
34M total
138K ./bin/pathchk
34M total
Upvotes: 1
Reputation: 2845
If you can install it, ncdu
is a nice program that does the same as du, but with a nice interface that also shows its progress while it scans.
On Debian, Ubuntu, etc, you can install it with
sudo apt install ncdu
Upvotes: 0
Reputation: 59586
I came up with a small bash one-liner to solve my issue. It's not as nice as using du
properly, but it gives progress information and it doesn't count hardlinks twice.
Here it is, first on one line and then spread out to make it clearer:
find -type f -printf "%s %i %p\n" | { sum=0; declare -A inodes; while read size inode path; do [ "${inodes[$inode]}" != 1 ] && { inodes[$inode]=1; ((sum+=size)); echo "$sum $size $path"; }; done; }
And the same nicely formatted:
find -type f -printf "%s %i %p\n" | {
  sum=0
  declare -A inodes
  while read size inode path
  do
    [ "${inodes[$inode]}" != 1 ] && {
      inodes[$inode]=1
      ((sum+=size))
      echo "$sum $size $path"
    }
  done
}
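One possible refinement, sketched here under assumptions that are not part of the answer above: inode numbers are only unique within a single filesystem, so this variant keys on device and inode together (%D:%i), and it prints progress only every few files to keep the output short. The function name progress_sum and its step argument are made up for illustration; GNU find and bash 4 or newer are assumed.

```shell
#!/usr/bin/env bash
# Sketch: running sum of apparent file sizes, counting hard-linked files once.
# progress_sum and step are illustrative names, not an existing tool.
progress_sum() {
  local dir=${1:-.} step=${2:-1000}
  local sum=0 n=0 size key
  local -A seen=()
  # %s = size in bytes, %D:%i = device number and inode number
  while read -r size key; do
    if [[ -n ${seen[$key]:-} ]]; then
      continue                      # same file reached via another hard link
    fi
    seen[$key]=1
    sum=$(( sum + size ))
    n=$(( n + 1 ))
    if (( n % step == 0 )); then
      echo "$sum bytes in $n files so far"
    fi
  done < <(find "$dir" -type f -printf '%s %D:%i\n')
  echo "$sum bytes in $n files total"
}
```

Calling progress_sum /some/dir 500 would print an intermediate line after every 500 distinct files and a final line with the total. Like the one-liner, it sums apparent sizes (%s), not allocated disk blocks.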
Upvotes: 1
Reputation: 3600
Maybe the command below gives you a hint on how to proceed:
ls -laR | awk '{ total += $5; if (FNR % 1000 == 0) print total } END { print total }'
In the awk statement, you can add various conditions to check whether an entry is a directory or a link.
And FNR%1000
will print the running total after every thousand lines it reads. Instead of ls
, you can use find.
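A minimal sketch of the find-based variant, assuming GNU find (for -printf); the function name running_total and the step argument are invented for this example:

```shell
# Sketch: print a running byte total every `step` files, then the final total.
# running_total and step are illustrative names; -printf requires GNU find.
running_total() {
  dir=${1:-.}
  step=${2:-1000}
  # %s prints the size in bytes of each regular file, one per line
  find "$dir" -type f -printf '%s\n' |
    awk -v step="$step" '
      { total += $1 }
      NR % step == 0 { printf "%.0f\n", total }   # periodic progress line
      END { printf "%.0f\n", total }              # final total
    '
}
```

Because find lists only regular files here, directory entries and the "total" lines of ls do not need to be filtered out. Hard-linked files are still counted once per link in this version.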
Upvotes: 0