Reputation: 3280
Quite simply I want to get a count of files in a directory (and all nested sub-directories) as quickly as possible.
I know how to do this using find paired with wc -l and similar methods; however, these are extremely slow because they walk every file entry in every directory and count them one by one.
Is this the fastest method, or are there alternatives? For example, I don't need to find specific types of files, so I'm fine with grabbing symbolic links, hidden files, etc. if that lets me get the file count more quickly by counting everything with no further processing involved.
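For reference, this is the kind of command I'm using now (the path and options are just an example):

$ find . | wc -l    # walks every entry (files, directories, links) under . and counts the lines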
Upvotes: 0
Views: 70
Reputation: 64563
The fastest method is to use locate + wc or similar. It can't be faster. The main disadvantage of this method is that it counts not the actual files but the entries in locate's database, and that database can already be up to a day old.
So it depends on your task: if it tolerates that delay, I would prefer locate.
On my superfast SSD-based machine:
$ time find /usr | wc -l
156610
real 0m0.158s
user 0m0.076s
sys 0m0.072s
$ time locate /usr | wc -l
156612
real 0m0.079s
user 0m0.068s
sys 0m0.004s
On a normal machine the difference will be much, much bigger.

How often the locate database is updated depends on the configuration of the host. By default, it is updated once a day (via cron). But you can configure the system so that the script runs every hour or even more frequently. Of course, you can also run it not periodically but on demand (thanks to William Pursell for the hint).
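If a stale count is not acceptable, one option (assuming root access and a standard updatedb setup) is to refresh the database right before counting:

$ sudo updatedb && locate /usr | wc -l    # rebuild the locate database, then count entries under /usr

Rebuilding the database still walks the filesystem, so this only pays off if you count the same tree repeatedly afterwards.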
Upvotes: 1
Reputation: 785196
Try this script as an alternative:
find . -type d -exec bash -c 'd=$1; arr=("$d"/*); echo "$d:${#arr[@]}"' _ {} \;
In my quick basic testing it came out faster than the find | wc -l approach.
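Note that this prints one count per directory rather than a grand total. If you want a single number, one way (a sketch that assumes no directory names contain newlines; colons in names are fine, since the count is always the last colon-separated field) is to sum the output with awk:

find . -type d -exec bash -c 'd=$1; arr=("$d"/*); echo "$d:${#arr[@]}"' _ {} \; |
    awk -F: '{ sum += $NF } END { print sum }'    # add up the per-directory counts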
Upvotes: 0