Reputation: 16555

Is there a bash command which counts files?

Is there a bash command which counts the number of files that match a pattern?

For example, I want to get the count of all files in a directory which match this pattern: log*

Upvotes: 314

Answers (16)

Daniel

Reputation: 5389

This simple one-liner should work in any shell, not just bash:

ls -1q log* | wc -l

ls -1q will give you one line per file, even if they contain whitespace or special characters such as newlines.

The output is piped to wc -l which counts the number of lines.

Upvotes: 410

Léa Gris

Reputation: 19675

This can be done with standard POSIX shell grammar.

Here is a simple count_entries function:

#!/usr/bin/env sh

count_entries()
{
  # Emulating Bash nullglob 
  # If argument 1 is not an existing entry
  if [ ! -e "$1" ]
    # argument is a returned pattern
    # then shift it out
    then shift
  fi
  echo $#
}

for a compact definition:

count_entries(){ [ ! -e "$1" ]&&shift;echo $#;}

Featured POSIX compatible file counter by type:

#!/usr/bin/env sh

count_files()
# Count the file arguments matching the file operator
# Synopsys:
# count_files operator FILE [...]
# Arguments:
# $1: The file operator
#   Allowed values:
#   -a FILE    True if file exists.
#   -b FILE    True if file is block special.
#   -c FILE    True if file is character special.
#   -d FILE    True if file is a directory.
#   -e FILE    True if file exists.
#   -f FILE    True if file exists and is a regular file.
#   -g FILE    True if file is set-group-id.
#   -h FILE    True if file is a symbolic link.
#   -L FILE    True if file is a symbolic link.
#   -k FILE    True if file has its `sticky' bit set.
#   -p FILE    True if file is a named pipe.
#   -r FILE    True if file is readable by you.
#   -s FILE    True if file exists and is not empty.
#   -S FILE    True if file is a socket.
#   -t FD      True if FD is opened on a terminal.
#   -u FILE    True if the file is set-user-id.
#   -w FILE    True if the file is writable by you.
#   -x FILE    True if the file is executable by you.
#   -O FILE    True if the file is effectively owned by you.
#   -G FILE    True if the file is effectively owned by your group.
#   -N FILE    True if the file has been modified since it was last read.
# $@: The files arguments
# Output:
#   The number of matching files
# Return:
#   1: Unknown file operator
{
  operator=$1
  shift
  case $operator in
    -[abcdefghLkprsStuwxOGN])
      for arg; do
        # If file is not of required type
        if ! test "$operator" "$arg"; then
          # Shift it out
          shift
        fi
      done
      echo $#
      ;;
    *)
      printf 'Invalid file operator: %s\n' "$operator" >&2
      return 1
      ;;
  esac
}

count_files "$@"

Example usages:

count_files -f log*.txt
count_files -d datadir*

Alternate count non-directory entries without a loop:

#!/bin/sh

# Creates strings of as many dots as expanded arguments

# dotted string for entries matching star pattern
star=$(printf '%.0s.' ./*)
# dotted string for entries matching star slash pattern (directories)
star_dir=$(printf '%.0s.' ./*/)
# dotted string for entries matching dot star pattern
dot_star=$(printf '%.0s.' ./.*)
# dotted string for entries matching dot star slash pattern (directories)
dot_star_dir=$(printf '%.0s.' ./.*/)

# Print pattern matches count excluding directories matches
printf 'Files count: %d\n' $((
  ${#star} - ${#star_dir} +
  ${#dot_star} - ${#dot_star_dir}
))

Upvotes: 2

Will Vousden

Reputation: 33408

For a recursive search:

find . -type f -name '*.log' -printf x | wc -c

wc -c will count the number of characters in the output of find, while -printf x tells find to print a single x for each result. This avoids any problems with files with odd names which contain newlines etc.

For a non-recursive search, do this:

find . -maxdepth 1 -type f -name '*.log' -printf x | wc -c

Upvotes: 88

Theodore R. Smith

Reputation: 23259

Here is a generic Bash function you can use in your scripts.

    # @see https://stackoverflow.com/a/11307382/430062
    function countFiles {
        shopt -s nullglob
        logfiles=($1)
        echo ${#logfiles[@]}
    }

    FILES_COUNT=$(countFiles "$file-*")

Upvotes: 1

Small Boy

Reputation: 305

An important comment

(not enough reputation to comment)

This is BUGGY:

ls -1q some_pattern | wc -l

If shopt -s nullglob happens to be set, it prints the number of ALL regular files, not just the ones with the pattern (tested on CentOS-8 and Cygwin). Who knows what other meaningless bugs does ls have?

This is CORRECT and much faster:

shopt -s nullglob; files=(some_pattern); echo ${#files[@]};

It does the expected job.

And the running times differ.
The 1st: 0.006 on CentOS, and 0.083 on Cygwin (in case it is used with care).
The 2nd: 0.000 on CentOS, and 0.003 on Cygwin.

Upvotes: 12

jturi

Reputation: 1743

To count everything just pipe ls to word count line:

ls | wc -l

To count with pattern, pipe to grep first:

ls | grep log | wc -l

Upvotes: -3

bballdave025

Reputation: 1438

I've given this answer a lot of thought, especially given the don't-parse-ls stuff. At first, I tried

<WARNING! DID NOT WORK>

du --inodes --files0-from=<(find . -maxdepth 1 -type f -print0) | awk '{sum+=int($1)}END{print sum}'

</WARNING! DID NOT WORK>

which worked if there was only a filename like

touch $'w\nlf.aa'

but failed if I made a filename like this

touch $'firstline\n3 and some other\n1\n2\texciting\n86stuff.jpg'

I finally came up with what I'm putting below. Note I was trying to get a count of all files in the directory (not including any subdirectories). I think it, along with the answers by @Mat and @Dan_Yard , as well as having at least most of the requirements set out by @mogsie (I'm not sure about memory.) I think the answer by @mogsie is correct, but I always try to stay away from parsing ls unless it's an extremely specific situation.

awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'

More readably:

awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'

This is doing a find specifically for files, delimiting the output with a null character (to avoid problems with spaces and linefeeds), then counting the number of null characters. The number of files will be one less than the number of null characters, since there will be a null character at the end.

To answer the OP's question, there are two cases to consider

1) Non-recursive search:

awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -name "log*" -print0) | \
    awk '{sum+=$1}END{print sum}'

2) Recursive search. Note that what's inside the -name parameter might need to be changed for slightly different behavior (hidden files, etc.).

awk -F"\0" '{print NF-1}' < \
  <(find . -type f -name "log*" -print0) | \
    awk '{sum+=$1}END{print sum}'

If anyone would like to comment on how these answers compare to those I've mentioned in this answer, please do.

Note, I got to this thought process while getting this answer.

Upvotes: 2

Moh .S

Reputation: 2090

You can use the -R option to find the files along with those inside the recursive directories

ls -R | wc -l // to find all the files

ls -R | grep log | wc -l // to find the files which contains the word log

you can use patterns on the grep

Upvotes: 4

Maëlan

Reputation: 4212

You can define such a command easily, using a shell function. This method does not require any external program and does not spawn any child process. It does not attempt hazardous ls parsing and handles “special” characters (whitespaces, newlines, backslashes and so on) just fine. It only relies on the file name expansion mechanism provided by the shell. It is compatible with at least sh, bash and zsh.

The line below defines a function called count which prints the number of arguments with which it has been called.

count() { echo $#; }

Simply call it with the desired pattern:

count log*

For the result to be correct when the globbing pattern has no match, the shell option nullglob (or failglob — which is the default behavior on zsh) must be set at the time expansion happens. It can be set like this:

shopt -s nullglob    # for sh / bash
setopt nullglob      # for zsh

Depending on what you want to count, you might also be interested in the shell option dotglob.

Unfortunately, with bash at least, it is not easy to set these options locally. If you don’t want to set them globally, the most straightforward solution is to use the function in this more convoluted manner:

( shopt -s nullglob ; shopt -u failglob ; count log* )

If you want to recover the lightweight syntax count log*, or if you really want to avoid spawning a subshell, you may hack something along the lines of:

# sh / bash:
# the alias is expanded before the globbing pattern, so we
# can set required options before the globbing gets expanded,
# and restore them afterwards.
count() {
    eval "$_count_saved_shopts"
    unset _count_saved_shopts
    echo $#
}
alias count='
    _count_saved_shopts="$(shopt -p nullglob failglob)"
    shopt -s nullglob
    shopt -u failglob
    count'

As a bonus, this function is of a more general use. For instance:

count a* b*          # count files which match either a* or b*
count $(jobs -ps)    # count stopped jobs (sh / bash)

By turning the function into a script file (or an equivalent C program), callable from the PATH, it can also be composed with programs such as find and xargs:

find "$FIND_OPTIONS" -exec count {} \+    # count results of a search

Upvotes: 6

Shuang Liang

Reputation: 7

Here's what I always do:

ls log* | awk 'END{print NR}'

Upvotes: -1

Mat

Reputation: 206909

You can do this safely (i.e. won't be bugged by files with spaces or \n in their name) with bash:

$ shopt -s nullglob
$ logfiles=(*.log)
$ echo ${#logfiles[@]}

You need to enable nullglob so that you don't get the literal *.log in the $logfiles array if no files match. (See How to "undo" a 'set -x'? for examples of how to safely reset it.)

Upvotes: 74

mogsie

Reputation: 4156

Lots of answers here, but some don't take into account

file names with spaces, newlines, or control characters in them
file names that start with hyphens (imagine a file called -l)
hidden files, that start with a dot (if the glob was *.log instead of log*
directories that match the glob (e.g. a directory called logs that matches log*)
empty directories (i.e. the result is 0)
extremely large directories (listing them all could exhaust memory)

Here's a solution that handles all of them:

ls 2>/dev/null -Ubad1 -- log* | wc -l

Explanation:

-U causes ls to not sort the entries, meaning it doesn't need to load the entire directory listing in memory
-b prints C-style escapes for nongraphic characters, crucially causing newlines to be printed as \n.
-a prints out all files, even hidden files (not strictly needed when the glob log* implies no hidden files)
-d prints out directories without attempting to list the contents of the directory, which is what ls normally would do
-1 makes sure that it's on one column (ls does this automatically when writing to a pipe, so it's not strictly necessary)
2>/dev/null redirects stderr so that if there are 0 log files, ignore the error message. (Note that shopt -s nullglob would cause ls to list the entire working directory instead.)
wc -l consumes the directory listing as it's being generated, so the output of ls is never in memory at any point in time.
-- File names are separated from the command using -- so as not to be understood as arguments to ls (in case log* is removed)

The shell will expand log* to the full list of files, which may exhaust memory if it's a lot of files, so then running it through grep is be better:

ls -Uba1 | grep ^log | wc -l

This last one handles extremely large directories of files without using a lot of memory (albeit it does use a subshell). The -d is no longer necessary, because it's only listing the contents of the current directory.

Upvotes: 92

z atef

Reputation: 7709

Here is my one liner for this.

 file_count=$( shopt -s nullglob ; set -- $directory_to_search_inside/* ; echo $#)

Upvotes: 11

Dan Yard

Reputation: 177

The accepted answer for this question is wrong, but I have low rep so can't add a comment to it.

The correct answer to this question is given by Mat:

shopt -s nullglob
logfiles=(*.log)
echo ${#logfiles[@]}

The problem with the accepted answer is that wc -l counts the number of newline characters, and counts them even if they print to the terminal as '?' in the output of 'ls -l'. This means that the accepted answer FAILS when a filename contains a newline character. I have tested the suggested command:

ls -l log* | wc -l

and it erroneously reports a value of 2 even if there is only 1 file matching the pattern whose name happens to contain a newline character. For example:

touch log$'\n'def
ls log* -l | wc -l

Upvotes: 14

mogsie

Reputation: 4156

If you have a lot of files and you don't want to use the elegant shopt -s nullglob and bash array solution, you can use find and so on as long as you don't print out the file name (which might contain newlines).

find -maxdepth 1 -name "log*" -not -name ".*" -printf '%i\n' | wc -l

This will find all files that match log* and that don't start with .* — The "not name .*" is redunant, but it's important to note that the default for "ls" is to not show dot-files, but the default for find is to include them.

This is a correct answer, and handles any type of file name you can throw at it, because the file name is never passed around between commands.

But, the shopt nullglob answer is the best answer!

Upvotes: 8

nudzo

Reputation: 18594

ls -1 log* | wc -l

Which means list one file per line and then pipe it to word count command with parameter switching to count lines.

Upvotes: 0

Is there a bash command which counts files?

Answers (16)

An important comment

Related Questions