Mardoz

Reputation: 1657

Command to list all file types and their average size in a directory

I am working on a specific project where I need to work out the make-up of a large extract of documents so that we have a baseline for performance testing.

Specifically, I need a command that can recursively go through a directory and, for each file type, inform me of the number of files of that type and their average size.

I've looked at solutions such as "Unix find average file size", "How can I recursively print a list of files with filenames shorter than 25 characters using a one-liner?" and https://unix.stackexchange.com/questions/63370/compute-average-file-size, but none of them quite gets me what I'm after.

Upvotes: 4

Views: 2675

Answers (3)

BMW

Reputation: 45243

Here is something to get you started: the script below prints each file's size and path, one file per line.

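#!/usr/bin/env bash

DIR=ABC                      # directory to scan
cd "$DIR" || exit 1          # stop if the directory cannot be entered

# IFS= and -r keep leading spaces and backslashes in file names intact
find . -type f | while IFS= read -r line
do
  # size=$(stat --format="%s" "$line")    # for systems with GNU stat
  size=$(perl -e 'print -s $ARGV[0],"\n"' "$line")  # @Mark Setchell provided this command, but I have no OS X system to test it
  echo "$size $line"
done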

Output sample

123 ./a.txt
23 ./fds/afdsf.jpg

The rest is your homework: from the output above it should be easy to work out each file type and its average size.
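One possible way to finish that step is to pipe the "size path" lines into awk and group them by extension. This is only a sketch: avg_by_ext is a hypothetical helper name, files without an extension (and dotfiles such as .bashrc) are grouped under "(none)", and paths containing newlines are not handled.

```shell
#!/usr/bin/env bash
# Summarise "size path" lines (as printed by the loop above) per extension.
# avg_by_ext is a hypothetical helper name, not part of the original answer.
avg_by_ext() {
  awk '{
    size = $1
    path = $0
    sub(/^[0-9]+ /, "", path)        # drop the size field, keep the full path
    name = path
    sub(/.*\//, "", name)            # strip leading directories
    ext = "(none)"
    # an extension is the text after the last dot, unless the name starts with it
    if (match(name, /\.[^.]+$/) && RSTART > 1)
      ext = substr(name, RSTART + 1)
    count[ext]++
    total[ext] += size
  }
  END {
    # one line per extension: extension, file count, average size
    for (e in count)
      printf "%s %d %.1f\n", e, count[e], total[e] / count[e]
  }' | sort
}
```

The script's output can then be piped straight into it, for example ./script.sh | avg_by_ext, which prints the extension, the file count and the average size on each line.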

Upvotes: 2

anubhava

Reputation: 785058

This du and awk combination should work for you:

du -a mydir/ | awk -F'[.[:space:]]' '/\.[a-zA-Z0-9]+$/ { a[$NF]+=$1; b[$NF]++ }
     END{for (i in a) print i, b[i], (a[i]/b[i])}' 
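Note that du reports disk usage in blocks rather than file length, and du -a also lists directories, so the averages above are approximate. If byte-accurate figures are needed, a variant along the following lines may help. It is a sketch that assumes GNU find (for -printf); ext_stats is a hypothetical helper name, and file names containing newlines are not handled.

```shell
#!/usr/bin/env bash
# Per-extension file count and average size in bytes.
# Assumes GNU find (-printf is not available in BSD/macOS find).
# ext_stats is a hypothetical helper name.
ext_stats() {
  # %s = size in bytes, %f = file name without leading directories
  find "${1:-.}" -type f -printf '%s %f\n' |
    awk '{
      size = $1
      name = substr($0, length($1) + 2)   # rest of the line; may contain spaces
      ext = "(none)"
      if (match(name, /\.[^.]+$/) && RSTART > 1)
        ext = substr(name, RSTART + 1)
      n[ext]++
      sum[ext] += size
    }
    END {
      for (e in n)
        printf "%s %d %.1f\n", e, n[e], sum[e] / n[e]
    }' |
    sort
}
```

Calling ext_stats mydir/ prints one line per extension with the file count and the average size in bytes.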

Upvotes: 10

Mark Setchell

Reputation: 207425

You could perhaps use "du":

du -a -c *.txt

Sample output:

104 M1.txt
8   in.txt
8   keys.txt
8   text.txt
8   wordle.txt
136 total

The output is in 512-byte blocks (the default on BSD/macOS; GNU du defaults to 1024-byte blocks), but you can change the unit with "-k" or "-m".

Upvotes: 0
