Dan

Reputation: 10171

Measure disk space of certain file types in aggregate

I have some files across several folders:

/home/d/folder1/a.txt
/home/d/folder1/b.txt
/home/d/folder1/c.mov
/home/d/folder2/a.txt
/home/d/folder2/d.mov
/home/d/folder2/folder3/f.txt

How can I measure the grand total amount of disk space taken up by all the .txt files in /home/d/?

I know du will give me the total space of a given folder, and ls -l will give me the size of individual files, but what if I want a single grand total for all the .txt files under /home/d/, including folder1, folder2, and subfolders like folder3?
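To try the answers below on something reproducible, the layout from the question can be recreated in a scratch directory. This is only a sketch: make_sample_tree is a made-up helper name, and the file contents (and therefore the sizes) are invented for illustration.

```shell
# Recreate the example tree from the question under a directory of your choice.
# File contents are arbitrary; the .txt files add up to 4 + 2 + 1 + 3 = 10 bytes.
make_sample_tree() {
    root=$1
    mkdir -p "$root/folder1" "$root/folder2/folder3"
    printf 'aaaa'   > "$root/folder1/a.txt"          # 4 bytes
    printf 'bb'     > "$root/folder1/b.txt"          # 2 bytes
    printf 'movie'  > "$root/folder1/c.mov"          # 5 bytes
    printf 'a'      > "$root/folder2/a.txt"          # 1 byte
    printf 'dddddd' > "$root/folder2/d.mov"          # 6 bytes
    printf 'fff'    > "$root/folder2/folder3/f.txt"  # 3 bytes
}
```

For example, make_sample_tree "$(mktemp -d)" builds the tree somewhere disposable, so a correct byte-counting answer should report 10 for *.txt.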

Upvotes: 28

Views: 20531

Answers (13)

Kevin E

Reputation: 3466

There are several potential problems with the accepted answer:

  1. it does not descend into subdirectories (without relying on non-standard shell features like globstar)
  2. in general, as pointed out by Dennis Williamson below, you should avoid parsing the output of ls
    • namely, if the user or group (columns 3 and 4) have spaces in them, column 5 will not be the file size
  3. if you have a million such files, this will spawn two million subshells, and it'll be sloooow

As proposed by ghostdog74, you can use the GNU-specific -printf option to find to achieve a more robust solution, avoiding all the excessive pipes, subshells, Perl, and weird du options:

# the '%s' format string means "the file's size"
find . -name "*.txt" -printf "%s\n" \
  | awk '{sum += $1} END{print sum " bytes"}'

Yes, yes, solutions using paste or bc are also possible, but not any more straightforward.

On macOS, you would need to use Homebrew or MacPorts to install findutils, and call gfind instead. (I see the "linux" tag on this question, but it's also tagged "unix".)

Without GNU find, you can still fall back to using du:

find . -name "*.txt" -exec du -k {} + \
  | awk '{kbytes+=$1} END{print kbytes " Kbytes"}'

…but you have to be mindful of the fact that du's default output is in 512-byte blocks for historical reasons (see the "RATIONALE" section of the man page), and some versions of du (notably, macOS's) will not even have an option to print sizes in bytes.

Many other fine solutions here (see Barn's answer in particular), but most suffer the drawback of being unnecessarily complex or depending too heavily on GNU-only features—and maybe in your environment, that's OK!
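If you need a byte count but have neither GNU find nor a du that prints bytes, one possibility (my own sketch, not taken from any answer here) is to let wc -c do the measuring. sum_bytes is a hypothetical helper name; it assumes file names contain no newlines and that the starting directory is a directory rather than a single file.

```shell
# Sum the apparent sizes (in bytes) of regular files matching a name pattern,
# using only POSIX find, wc and awk.
sum_bytes() {
    dir=$1
    pattern=$2
    # wc -c prints "<bytes> <path>" per file, plus a "<bytes> total" line
    # whenever it is handed more than one file. Paths printed by find always
    # include the starting directory, so a bare "total" can only be wc's summary.
    find "$dir" -type f -name "$pattern" -exec wc -c {} + |
        awk '!(NF == 2 && $2 == "total") { sum += $1 } END { print sum + 0 }'
}
```

Usage would be something like sum_bytes /home/d '*.txt'. Note that this counts file contents (apparent size), whereas du reports the disk blocks actually allocated, so the two numbers will usually differ.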

Upvotes: 0

ennuikiller

Reputation: 46965

This will do it:

total=0
for file in *.txt
do
    space=$(ls -l "$file" | awk '{print $5}')
    let total+=space
done
echo $total

Upvotes: 4

Dennis Williamson

Reputation: 360325

Here's a way to do it (in Linux, using GNU coreutils du and Bash syntax), avoiding bad practice:

total=0
while read -r line
do
    size=($line)
    (( total+=size ))
done < <( find . -iname "*.txt" -exec du -b {} + )
echo "$total"

If you want to exclude the current directory, use -mindepth 2 with find.

Another version that doesn't require Bash syntax:

find . -iname "*.txt" -exec du -b {} + | awk '{total += $1} END {print total}'

Note that these won't work properly with file names which include newlines (but those with spaces will work).
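If names with newlines are a real concern, one way around it (a sketch, not part of the answer above) is to never print the names at all. The hypothetical helper sum_bytes_safe below redirects each file into wc -c, so only numbers reach awk. It uses -name rather than -iname to stay within POSIX find, and it starts one wc per file, so it trades speed for robustness.

```shell
# Sum byte sizes without ever printing a file name, so names containing
# spaces or newlines cannot corrupt the numbers being added up.
sum_bytes_safe() {
    dir=$1
    pattern=$2
    find "$dir" -type f -name "$pattern" -exec sh -c '
        for f in "$@"; do
            wc -c < "$f"    # reading from stdin makes wc print only the count
        done' sh {} + |
        awk '{ total += $1 } END { print total + 0 }'
}
```

Called as sum_bytes_safe . '*.txt', it prints a single byte total. Like du -b, this measures apparent size rather than allocated disk blocks.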

Upvotes: 6

EvilRick

Reputation: 184

Simple:

du -ch *.txt

If you just want the total space taken to show up, then:

du -ch *.txt | tail -1

Upvotes: 16

John C. Worsley

Reputation: 1

For anyone wanting to do this at the macOS command line: the stock find has no -printf, so you need a variation based on -print0 and stat instead. Some of the above answers address that, but this one reports a total for every extension:

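find . -type f -print0 | xargs -0 stat -f "%z %N" |
  awk '{
      # %z is the size in bytes (%i would be the inode number)
      SIZE=$1;
      # drop the size column and the directory part, keeping the bare file name
      sub( /^[0-9]+ /, "" ); sub( /.*\//, "" );
      PARTSCOUNT=split( $0, FILEPARTS, "." );
      EXTENSION=PARTSCOUNT == 1 ? "NULL" : FILEPARTS[PARTSCOUNT];
      FILETYPE_MAP[EXTENSION]+=SIZE
    }
   END {
     for( FILETYPE in FILETYPE_MAP ) {
       print FILETYPE_MAP[FILETYPE], FILETYPE;
      }
   }' | sort -n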

Upvotes: 0

ppuschmann

Reputation: 235

macOS

  • use du with the -I parameter to exclude all other files

Linux

-X, --exclude-from=FILE
              exclude files that match any pattern in FILE

--exclude=PATTERN
              exclude files that match PATTERN

Upvotes: 5

Timtico

Reputation: 387

My solution for getting the total size of all text files in a given path and its subdirectories (using a Perl one-liner):

find /path -iname '*.txt' | perl -lane '$sum += -s $_; END {print $sum}'

Upvotes: 2

boes

Reputation: 2855

I like to use find in combination with xargs:

find . -name "*.txt" -print0 |xargs -0 du -ch

Add tail if you only want to see the grand total

find . -name "*.txt" -print0 |xargs -0 du -ch | tail -n1

Upvotes: 0

texasflood

Reputation: 1633

A one liner for those with GNU tools on bash:

for i in $(find . -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u); do echo "$i"": ""$(du -hac **/*."$i" | tail -n1 | awk '{print $1;}')"; done | sort -h -k 2 -r

You must have globstar enabled (it is what makes ** match recursively):

shopt -s globstar

If you want dot files to work, you must run

shopt -s dotglob

Sample output:

d: 3.0G
swp: 1.3G
mp4: 626M
txt: 263M
pdf: 238M
ogv: 115M
i: 76M
pkl: 65M
pptx: 56M
mat: 50M
png: 29M
eps: 25M

etc

Upvotes: 2

Barn

Reputation: 954

This will report disk space usage in bytes by extension:

find . -type f -printf "%f %s\n" |
  awk '{
      PARTSCOUNT=split( $1, FILEPARTS, "." );
      EXTENSION=PARTSCOUNT == 1 ? "NULL" : FILEPARTS[PARTSCOUNT];
      FILETYPE_MAP[EXTENSION]+=$2
    }
   END {
     for( FILETYPE in FILETYPE_MAP ) {
       print FILETYPE_MAP[FILETYPE], FILETYPE;
      }
   }' | sort -n

Output:

3250 png
30334451 mov
57725092729 m4a
69460813270 3gp
79456825676 mp3
131208301755 mp4
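Raw byte counts this large are hard to read at a glance. As a sketch (human_sizes is a made-up helper name, and the choice of binary units is an assumption), the sorted report can be piped through a small awk filter that rewrites the first column:

```shell
# Convert the leading byte count on each input line to a binary-prefixed
# figure (KiB, MiB, ...), keeping the label in the second column.
human_sizes() {
    awk '{
        size = $1; unit = 1
        split("B KiB MiB GiB TiB", units, " ")
        # divide by 1024 until the number is small or we run out of units
        while (size >= 1024 && unit < 5) { size /= 1024; unit++ }
        printf "%.1f %s %s\n", size, units[unit], $2
    }'
}
```

Append it after the sort -n, since the converted column no longer sorts numerically. For instance, printf '3250 png\n' | human_sizes prints 3.2 KiB png.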

Upvotes: 26

John Minkle

Reputation: 21

Building on ennuikiller's answer, this will handle spaces in names. I needed to do this and get a little report:

find -type f -name "*.wav" | grep export | ./calc_space

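#!/bin/bash
# calc_space: read file names on stdin, print each file's size and the total
echo SPACE USED IN MEGABYTES
echo
total=0
# IFS= and -r keep leading/trailing spaces and backslashes in names intact
while IFS= read -r FILE
do
    du -m "$FILE"
    space=$(du -m "$FILE" | awk '{print $1}')
    let total+=space
done
echo "$total"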

Upvotes: 2

ghostdog74

Reputation: 342629

With GNU find:

find /home/d -type f -name "*.txt" -printf "%s\n" | awk '{s+=$0}END{print "total: "s" bytes"}'

Upvotes: 3

Barry Kelly

Reputation: 42152

find folder1 folder2 -iname '*.txt' -print0 | du --files0-from - -c -s | tail -1

Upvotes: 46
