Reputation: 469
What is the best practice for printing a top 10 list of largest files in a POSIX shell? There has to be something more elegant than my current solution:
DIR="."
N=10
LIMIT=512000
find "$DIR" -type f -size +"${LIMIT}k" -exec du {} \; | sort -nr | head -n "$N" | perl -p -e 's/^\d+\s+//' | xargs -I {} du -h {}
where LIMIT is a file size threshold in kibibytes (the k suffix on -size) that limits the results of find.
Upvotes: 7
Views: 3874
Reputation: 360325
Edit:
Using GNU utilities (du and sort):
du -0h | sort -zrh | tr '\0' '\n'
This uses a null delimiter to pass information between du and sort, and uses tr to convert the nulls to newlines. The nulls allow this pipeline to process filenames which may include newlines. The -h option makes du print sizes in human-readable form, and the -h option of sort compares those sizes by magnitude (so 2K sorts below 1G).
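To tie this back to the question (files rather than directories, and only the top N), something like the following sketch might work. It assumes GNU find, du, sort, and head (head -z needs a reasonably recent coreutils); top_files is just an illustrative name, not part of the pipeline above.

```shell
# Sketch: list the n largest regular files under dir, human-readable.
# Assumes GNU find, du, sort and head; top_files is a hypothetical helper.
top_files() {
    dir=${1:-.}
    n=${2:-10}
    # NUL-terminated records keep filenames with newlines intact
    # until the final tr converts them for display.
    find "$dir" -type f -exec du -0h {} + |
        sort -zrh |
        head -z -n "$n" |
        tr '\0' '\n'
}

# Usage (example arguments): top_files /var/log 5
```

Because du is called with many files per invocation (-exec ... {} +), this avoids spawning one du per file as in the question.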
Original:
This uses awk to create extra columns for sort keys. It only calls du once. The output should look exactly like that of du.
I've split it into multiple lines, but it can be recombined into a one-liner.
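# Prefix each line with two sort keys: the rank of the unit suffix
# (K=1, M=2, G=3) and the zero-padded number; cut strips them afterwards.
# substr starts at 1 because awk strings are 1-indexed.
du -h |
awk '{printf "%s %08.2f\t%s\n",
    index("KMG", substr($1, length($1))),
    substr($1, 1, length($1)-1), $0}' |
sort -r | cut -f2-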
Explanation:
Try it without the cut command to see what it's doing.
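To see the sort keys without running du on a real tree, here is a small sketch that feeds three made-up du -h lines through the awk stage only. The sample sizes and paths are invented, add_sort_keys is a hypothetical name, and substr is called with a start index of 1 since awk strings are 1-indexed.

```shell
# Sketch: show the keys that the awk stage prepends to each du -h line.
# add_sort_keys is a hypothetical wrapper around the awk stage.
add_sort_keys() {
    awk '{printf "%s %08.2f\t%s\n",
        index("KMG", substr($1, length($1))),
        substr($1, 1, length($1)-1), $0}'
}

# Invented sample input in du -h format: size, tab, path.
printf '4.0K\t./a\n1.5G\t./b\n20M\t./c\n' | add_sort_keys | sort -r
```

With this sample input, the first output line is the 1.5G entry prefixed with the key "3 00001.50", followed by the 20M entry ("2 00020.00") and the 4.0K entry ("1 00004.00"). The first key ranks the unit, the second orders sizes within the same unit.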
Edit:
Here's a version which does the sorting within the AWK script and doesn't need cut (requires GNU AWK (gawk) for asorti support):
# Build the same sort key as an array index, then let asorti order the keys.
du -h0 |
gawk 'BEGIN {RS = "\0"}
    {idx = sprintf("%s %08.2f %s",
        index("KMG", substr($1, length($1))),
        substr($1, 1, length($1)-1), $0);
    lines[idx] = $0}
    END {c = asorti(lines, sorted);
    for (i = c; i >= 1; i--)
        print lines[sorted[i]]}'
Edit: Added null record separation in order to handle potential filenames which include newlines. Requires GNU du and gawk.
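If the GNU tools aren't available, a rough portable sketch is to sort on plain du -k numbers and only humanize the sizes at the end. This is a different technique from the ones above, not a drop-in equivalent: it assumes filenames without newlines, and top_files_posix and humanize_kib are made-up names.

```shell
# Sketch: POSIX-only variant. du -k reports sizes in 1024-byte units,
# so a plain numeric sort works and awk humanizes the result afterwards.
# Both function names are hypothetical. Filenames with newlines will break it.
humanize_kib() {
    awk '{
        size = $1; unit = "K"
        if (size >= 1048576) { size /= 1048576; unit = "G" }
        else if (size >= 1024) { size /= 1024; unit = "M" }
        # Strip the leading size column, keep the path untouched.
        path = $0; sub(/^[0-9]+[ \t]+/, "", path)
        printf "%.1f%s\t%s\n", size, unit, path
    }'
}

top_files_posix() {
    dir=${1:-.}
    n=${2:-10}
    find "$dir" -type f -exec du -k {} + |
        sort -rn |
        head -n "$n" |
        humanize_kib
}

# Usage (example arguments): top_files_posix . 10
```

Sorting before humanizing means no extra sort keys are needed, at the cost of the output not matching du -h rounding exactly.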
Upvotes: 7