Haravikk

Reputation: 3280

Fastest Method for Counting Files in Directory Hierarchy

Quite simply I want to get a count of files in a directory (and all nested sub-directories) as quickly as possible.

I know how to do this using find paired with wc -l and similar methods; however, these are extremely slow because they pass through every file entry in each directory and count them one by one.
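For reference, the kind of baseline I mean is roughly this (counting every entry under the current directory, directories and symlinks included):

$ find . | wc -l    # counts every entry: files, directories, symlinks, hidden files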

Is this the fastest method, or are there alternatives? For example, I don't need to find specific types of files, so I'm fine with including symbolic links, hidden files, etc. if I can get the file count more quickly by counting everything with no further processing involved.

Upvotes: 0

Views: 70

Answers (2)

Igor Chubin

Reputation: 64563

The fastest method is to use locate + wc or similar; it can't be beaten. The main disadvantage of this method is that it counts not the actual files, but the files recorded in locate's database, and that database can already be up to a day old.

So it depends on your task: if it tolerates delays, I would prefer locate.

On my superfast SSD-based machine:

$ time find /usr | wc -l
156610

real    0m0.158s
user    0m0.076s
sys     0m0.072s

$ time locate /usr | wc -l
156612

real    0m0.079s
user    0m0.068s
sys     0m0.004s

On a normal machine the difference will be much much bigger.

How often the locate database is updated depends on the configuration of the host. By default, it is updated once a day (via cron), but you can configure the system so that the update script runs every hour or even more frequently. Of course, you can also run it not periodically, but on demand (thanks to William Pursell for the hint).
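For example, a rough sketch of an on-demand refresh (assuming a standard updatedb setup; the exact command and required privileges depend on your distribution):

$ sudo updatedb          # refresh the locate database on demand
$ locate /usr | wc -l    # count entries under /usr against the fresh database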

Upvotes: 1

anubhava

Reputation: 785196

Try this script as an alternative:

find . -type d -exec bash -c 'arr=("$1"/*); echo "$1: ${#arr[@]}"' _ {} \;

In my quick basic testing it came out faster than find | wc -l.
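If you want a single grand total rather than a per-directory breakdown, a variation along the same lines might look like this (just a sketch: the glob skips hidden files and counts subdirectories as entries, and nullglob keeps empty directories from reporting 1):

find . -type d -exec bash -c 'shopt -s nullglob; arr=("$1"/*); echo "${#arr[@]}"' _ {} \; | awk '{ total += $1 } END { print total }'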

Upvotes: 0
