Reputation: 827
How can I search through a massive amount of data (28TB) to find the largest 10 files in the past 24 hours?
From the current answers below I've tried:
$ find . -type f -mtime -1 -printf "%p %s\n" | sort -k2nr | head -5
This command takes over 24 hours, which defeats the purpose of searching for the most recently modified files in the past 24 hours. Are there any known solutions that are drastically faster than the one above? Monitoring the system won't work either: there is simply too much to monitor, and doing so could cause performance problems.
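One way to cut wall-clock time on a tree this large is to run a separate `find` per top-level subdirectory in parallel with `xargs -P`, then sort the combined results once at the end. A minimal sketch; the sandbox paths, file sizes, and parallelism level of 8 are illustrative, not taken from the question:

```shell
# Sandbox standing in for the large tree (paths and sizes are illustrative)
root=$(mktemp -d)
mkdir "$root/a" "$root/b"
head -c 300 /dev/zero > "$root/a/f1"
head -c 900 /dev/zero > "$root/b/f2"

# Scan each top-level subdirectory in its own find process (-P 8 runs
# up to 8 in parallel), emit "size path" lines, then sort once.
find "$root" -mindepth 1 -maxdepth 1 -type d -print0 \
  | xargs -0 -P 8 -I{} find {} -type f -mtime -1 -printf "%s %p\n" \
  | sort -k1,1nr | head -10
```

This only helps when the top-level directories are of comparable size and the storage can serve concurrent metadata reads; on a single slow spindle the parallel scans may just contend with each other.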
Upvotes: 2
Views: 3105
Reputation: 140196
You can use the standard yet very powerful find command like this (start_directory is the directory to scan):
find start_directory -type f -mtime -1 -size +3G
-mtime -1
option: files modified within the last 24 hours
-size +3G
option: files larger than 3 GiB (note that +3000G would mean larger than 3000 GiB, i.e. about 3 TB)
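Adding a size pre-filter like this before sorting keeps `sort` from having to process every file in the tree. A minimal sketch of combining the two; the temporary sandbox, file sizes, and the `+2k` threshold are illustrative only:

```shell
# Sandbox with files of known sizes (illustrative setup)
dir=$(mktemp -d)
head -c 1024 /dev/zero > "$dir/small"
head -c 4096 /dev/zero > "$dir/medium"
head -c 8192 /dev/zero > "$dir/large"

# Pre-filter by size (-size +2k: larger than 2 KiB) so sort only sees
# candidate files, then sort numerically descending on the size column.
find "$dir" -type f -mtime -1 -size +2k \
  -printf "%s %p\n" | sort -k1,1nr | head -10
```

The trade-off is that the threshold must be a guess: set it too high and a top-10 file can be filtered out, too low and the pre-filter saves little.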
Upvotes: 0
Reputation: 67507
something like this?
$ find . -type f -mtime -1 -printf "%p %s\n" | sort -k2nr | head -5
This lists the 5 largest files modified in the past 24 hours; change head -5 to head -10 for the top ten.
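One caveat with newline-delimited output: filenames containing newlines break the `sort | head` pipeline. With GNU findutils and coreutils (head -z needs coreutils 8.25 or later), NUL-separated records are safe; a sketch, with the sandbox filename chosen to include a space for illustration:

```shell
# Sandbox with an awkward filename (illustrative)
dir=$(mktemp -d)
head -c 100 /dev/zero > "$dir/plain"
head -c 500 /dev/zero > "$dir/with space"

# NUL-terminated records survive any filename; -z keeps sort and head
# record-safe, and tr makes the final result human-readable.
find "$dir" -type f -mtime -1 -printf "%s\t%p\0" \
  | sort -z -k1,1nr | head -z -n 10 | tr '\0' '\n'
```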
Upvotes: 3