Reputation: 21965
Is there a reason why
find . -mindepth 1 -maxdepth 1 | wc -l
is suggested against
ls -1 | wc -l
(or vice versa)
to count the total number of files and directories inside a folder?
Notes:
.
\n
in it.
Upvotes: 5
Views: 1383
Reputation: 414
May I add some more?
As stated by mogsie, the main reason is performance.
Disclosure: I have used this solution in production to count the entries of a directory with about 300k items:
find . -mindepth 1 -maxdepth 1 -printf '.' | wc -m
Basically, this prints a dot to standard output for every filesystem entry, then counts the printed characters.
The advantage regarding file names is easy to see: they are never used. The performance advantage is that no file attribute needs to be read to count the entries (as you would expect from a function that counts files in a directory), unless you specify a filter.
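To illustrate the point about file names, here is a minimal sketch that compares the two counting styles on a throwaway directory. It assumes GNU find (for -printf) and mktemp; the directory and file names are made up for the demonstration.

```shell
# Hypothetical scratch directory, created only for this demonstration.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain" "$demo_dir/.hidden"
# One entry whose name contains an embedded newline.
touch "$demo_dir/$(printf 'two\nlines')"

# Counting lines: the embedded newline splits one name across two lines.
by_lines=$(find "$demo_dir" -mindepth 1 -maxdepth 1 | wc -l)

# Counting one dot per entry: the names never reach the output.
by_dots=$(find "$demo_dir" -mindepth 1 -maxdepth 1 -printf '.' | wc -m)

echo "lines: $by_lines, dots: $by_dots"
rm -rf "$demo_dir"
```

With three entries in the directory, the line count comes out one too high, while the dot count matches the number of entries.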
If you want to start the count and come back later to see how many items have been found, you can redirect standard output to a file (ideally on a tmpfs, so nothing is written to disk), detach from the shell, and later count the characters in the file:
nohup find . -mindepth 1 -maxdepth 1 -printf '.' > /tmp/count.txt &
Counting the dots in the file then gives you the current count:
wc -m /tmp/count.txt
... and if you want to follow the counter as it updates:
watch wc -m /tmp/count.txt
Upvotes: 2
Reputation: 4156
The reason find(1) is preferred to ls(1) is that ls defaults to sorting the list of files, whereas find has no sorting capability.
Sorting can be extremely memory consuming for large data sets. So even though you can use ls -f or ls -U to disable sorting, I find that using find is safer because I know that the directory listing won't be sorted, no matter what options are passed to it.
In any case, telling the command to print less about each file can help with both performance and correctness: performance, because the command can avoid the stat(2) call; correctness, because if you print only the inode, for example, you can be certain that the name of the file won't affect the output (line breaks, carriage returns, or other odd characters).
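As a rough sketch of the "only print the inode" idea, the following assumes GNU find, whose -printf accepts %i for the inode number. The scratch directory and the odd file names are hypothetical, and the snippet demonstrates only the correctness side, not any performance gain.

```shell
# Hypothetical scratch directory with deliberately awkward names.
demo_dir=$(mktemp -d)
touch "$demo_dir/first"
touch "$demo_dir/$(printf 'odd\rname')"      # embedded carriage return
touch "$demo_dir/$(printf 'split\nname')"    # embedded newline

# Print one inode number per entry; names never appear in the output,
# so each entry produces exactly one line.
inode_lines=$(find "$demo_dir" -mindepth 1 -maxdepth 1 -printf '%i\n' | wc -l)

echo "entries: $inode_lines"
rm -rf "$demo_dir"
```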
Upvotes: 2
Reputation: 311586
The first command...
find . -mindepth 1 -maxdepth 1 | wc -l
...will list files and directories whose names start with ., while your ls command will not. The equivalent ls command would be:
ls -A | wc -l
Both will give you the same answer. As folks pointed out in the comments, both will give wrong answers if any entries contain embedded newlines, because these commands simply count the lines of output.
Here's one way to count the number of files that is independent of filename quirks:
find . -mindepth 1 -maxdepth 1 -print0 | xargs -0 -I{} echo | wc -l
This passes the filenames to xargs with a NUL terminator, rather than relying on newlines; xargs then prints a blank line for each file, and we count the lines of output from xargs.
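A possible variation on the same NUL-terminated idea, not taken from the answer above, is to skip xargs and count the NUL bytes directly. This sketch assumes a find that supports -print0 and a tr that understands the '\0' escape (both true of the GNU tools); the scratch directory is hypothetical.

```shell
# Hypothetical scratch directory, including a name with an embedded newline.
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/.b" "$demo_dir/$(printf 'c\nd')"

# Each entry is followed by exactly one NUL byte. Delete every other byte,
# then count what is left.
nul_count=$(find "$demo_dir" -mindepth 1 -maxdepth 1 -print0 | tr -cd '\0' | wc -c)

echo "entries: $nul_count"
rm -rf "$demo_dir"
```

This avoids starting one echo process per entry, which may matter for very large directories.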
Upvotes: 5