mikey

Reputation: 135

List small files in a directory and output a summary total

I want to (1) identify files in a directory that are smaller than 64 bytes and (2) print their names and sizes. The following one-liner does the job:

find . -size -64c -exec ls -lh {} \;|awk '{print $5, $9}'

This prints a list of files along with their sizes. Can I easily extend this one-liner to also print the total number of files found, in effect piping the file list into a wc -l command?

Upvotes: 1

Views: 216

Answers (3)

dawg

Reputation: 103884

You can also use a short Python script with pathlib to replace both find and awk:

python3 -c 'from pathlib import Path
cnt, fsum = 0, 0
# Walk everything below the current directory; keep regular files under 64 bytes
for fn in Path().glob("**/*"):
    if not fn.is_file():
        continue
    size = fn.stat().st_size
    if size < 64:
        print(f"{fn}: {size:,} bytes")
        fsum += size
        cnt += 1

print(f"Total files: {cnt:,} Total bytes: {fsum:,}")'

Some key elements:

  1. Path() with no argument refers to the current directory; Path(".") is equivalent. You can also pass any other path to use as the root of the file tree.
  2. glob("**/*") walks the tree recursively, yielding every file and directory in or below the one given to Path, similar to find.

Or in Ruby:

ruby -e '
fsum=0; cnt=0
Dir.glob("**/*").each{|fn| 
    st=File.stat(fn)
    if (st.size<64 && st.file?)
        puts "#{fn}: #{st.size} bytes" 
        cnt+=1
        fsum+=st.size
    end
}
puts "Total files: #{cnt} Total bytes: #{fsum}"'

Upvotes: 0

Ed Morton

Reputation: 203674

Consider doing this instead:

find . -size -64c -printf '%s %p\n' | awk '1; END{print NR}'

That way you're not spawning a separate ls process for every file found, so it's much more efficient. It can also be easily tweaked to handle file names that contain any characters, including newlines, e.g. with GNU tools that allow NUL-separated file names:

find . -size -64c -printf '%s %p\0' | awk -v RS='\0' '1; END{print NR}'

Add -v ORS='\0' if you want the awk output to be NUL-separated too.
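To see why the NUL-separated form matters, here is a small sketch that counts matches both ways in a scratch directory. It assumes GNU find (for -printf); the scratch directory, the file names, and the -type f test are illustrative additions, not part of the commands above.

```shell
# Scratch directory with made-up files; one name contains a newline.
demo=$(mktemp -d)
tricky=$(printf 'two\nlines.txt')
printf 'tiny' > "$demo/a.txt"          # 4 bytes
printf 'x'    > "$demo/$tricky"        # 1 byte
printf '%0100d' 0 > "$demo/big.bin"    # 100 bytes, too large to match

# Newline-separated records: the embedded newline splits one file into two records.
naive=$(cd "$demo" && find . -type f -size -64c -printf '%s %p\n' | awk 'END { print NR }')

# NUL-separated records: delete everything except the NULs, then count them.
safe=$(cd "$demo" && find . -type f -size -64c -printf '%s %p\0' | tr -cd '\0' | wc -c)

echo "newline-separated count: $naive"
echo "NUL-separated count: $safe"
rm -rf "$demo"
```

Here the newline-separated count comes out one higher than the number of matching files, while the NUL-separated count is exact. Counting NULs with tr and wc is just one way to do it; it was chosen for this sketch so the result does not depend on which awk is installed.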

If you don't want to do that, at least change:

find . -size -64c -exec ls -lh {} \;

to:

find . -size -64c -exec ls -lh {} +

so ls is called on groups of files instead of one at a time.
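As a rough sketch, the batched form can be combined with the awk from the question and a trailing count. The scratch directory and file names are made up for the demonstration; -type f and LC_ALL=C are additions that keep directories out of the listing and keep the ls date columns in a predictable layout. Like the original, it assumes file names without whitespace.

```shell
# Scratch directory with two small files and one large one (names are hypothetical).
demo=$(mktemp -d)
printf 'tiny' > "$demo/a.txt"          # 4 bytes
printf 'x'    > "$demo/b.txt"          # 1 byte
printf '%0100d' 0 > "$demo/big.bin"    # 100 bytes, should be skipped

# {} + hands many file names to each ls invocation instead of one per file.
# awk prints size and name for each line, then the number of lines at the end.
report=$(cd "$demo" && LC_ALL=C find . -type f -size -64c -exec ls -lh {} + |
         awk '{ print $5, $9 } END { print NR }')

printf '%s\n' "$report"
rm -rf "$demo"
```

The last line of the output is the file count; the lines before it are the size and name pairs.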

Note that, as @dawg mentions in the comments, -printf may be GNU-only.

Upvotes: 2

konsolebox

Reputation: 75498

Can I easily extend this one-liner to also print out the total number of files found?

Yes, by printing NR in the END block:

... | awk '{ print $5, $9 } END { print NR }'

You can also include a label:

... END { print "Total number of files found: " NR }'
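Spelled out in full, the idea might look like the sketch below, which also keeps a running byte total in awk. The scratch directory and file names are hypothetical, and -type f, LC_ALL=C, and the byte total are additions for illustration. It uses ls -l rather than ls -lh so that the fifth column is always a plain byte count.

```shell
# Scratch directory with made-up sample files.
demo=$(mktemp -d)
printf 'tiny' > "$demo/a.txt"          # 4 bytes
printf 'x'    > "$demo/b.txt"          # 1 byte
printf '%0100d' 0 > "$demo/big.bin"    # 100 bytes, should be skipped

# Print size and name per file, accumulate the sizes, and report both totals in END.
summary=$(cd "$demo" && LC_ALL=C find . -type f -size -64c -exec ls -l {} \; |
          awk '{ print $5, $9; bytes += $5 }
               END {
                   print "Total number of files found: " NR
                   print "Total bytes: " (bytes + 0)
               }')

printf '%s\n' "$summary"
rm -rf "$demo"
```

Adding 0 to bytes in the END block makes awk print 0 rather than an empty string when nothing matches.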

Upvotes: 0
