zell

Reputation: 10204

Bash command to batch-process files using find, sorted by size

I am looking for a Linux command that batch-processes all files in the current directory in ascending order of file size.

As a concrete example, my hello.py prints the file names:

import sys
print('hello, ' + sys.argv[1])

If my current directory has files file1, file2, and file3, with size(file1)<=size(file2)<=size(file3), then the Linux command I am looking for should output

hello, file1
hello, file2
hello, file3

For now, I use

find . -type f -exec python hello.py {} \;

But I do not see how to process the files in order of their sizes. Any ideas? Thanks.

Upvotes: 1

Views: 556

Answers (1)

bakkal

Reputation: 55448

Using ls

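ls can sort by size with the -S switch. It lists the largest file first, so add -r to get the ascending order you asked for:

for x in $(ls -Sr); do
    python hello.py "$x"
done

Or as a one-liner: for x in $(ls -Sr); do python hello.py "$x"; done

Or use xargs, like this: ls -1 -Sr | xargs -n 1 python hello.py. Be careful, though: both $(ls -Sr) and xargs split a filename containing spaces into several arguments; more on that below*. Also note that ls lists subdirectories as well as regular files, and does not descend into them the way find does.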

Using find without changing hello.py

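find . -type f | xargs du | sort -n | cut -f 2 | xargs -n 1 python hello.py

Explanation:

  1. du prefixes each filename with its size and a tab. By default this is disk usage in blocks, not the exact byte count; with GNU du, -b reports bytes instead
  2. sort -n sorts numerically by that size column, smallest first
  3. cut removes the size column, keeping only the second column, which is the filename
  4. xargs -n 1 calls hello.py once per filename; without -n 1, all filenames would be passed to a single invocation and hello.py would only see the first one in sys.argv[1]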
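
The annotate, sort, and strip steps above can also be sketched directly in Python, which sidesteps the shell's filename-splitting issues and uses exact byte sizes from os.path.getsize. This is only an illustration, not part of the original answer: the function name files_by_size is hypothetical, and skipping symlinks is an assumption made to mirror find's -type f.

```python
import os


def files_by_size(root="."):
    """Return the regular files under root, smallest first (hypothetical helper)."""
    sized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: skip symlinks so the result matches `find -type f`.
            if os.path.isfile(path) and not os.path.islink(path):
                # Step 1: annotate each path with its size in bytes.
                sized.append((os.path.getsize(path), path))
    # Step 2: sort by size (ties are broken by path).
    sized.sort()
    # Step 3: drop the size column and keep only the paths.
    return [path for _size, path in sized]


if __name__ == "__main__":
    # Step 4: process each file in ascending order of size.
    for path in files_by_size("."):
        print("hello " + path)
```

Saved under a name of your choice and run with python, it walks the current directory recursively, like find does.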

Making the Python script accept pipes

# hello.py
import sys

def process(filename):
    print('hello ' + filename)

if __name__ == '__main__':
    # Each line read from stdin is one filename; strip the trailing newline
    for line in sys.stdin:
        process(line.rstrip('\n'))

Now you can pipe the output to it, e.g.:

find . -type f | xargs du | sort -n | cut -f 2 | python hello.py

* If you need to support filenames with spaces in them, separate the names with NUL bytes instead of newlines (find -print0 and xargs -0), so:

find . -type f -print0 | xargs -0 du | ... 
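
If the script itself should read NUL-separated names from a pipe, a variant of hello.py could look like the sketch below. It assumes Python 3 (for sys.stdin.buffer), and the helper name iter_nul_names is hypothetical; process is the function from the script above.

```python
import os
import sys


def iter_nul_names(stream):
    """Yield filenames from a binary stream of NUL-terminated records."""
    data = stream.read()
    for raw in data.split(b"\0"):
        if raw:  # skip the empty chunk that follows the final NUL
            # Decode with the filesystem encoding, as os functions do.
            yield os.fsdecode(raw)


def process(filename):
    print("hello " + filename)


if __name__ == "__main__":
    # Read raw bytes so that spaces and newlines in names survive intact.
    for name in iter_nul_names(sys.stdin.buffer):
        process(name)
```

For example, find . -type f -print0 | python3 hello.py feeds it every file, unsorted. Assuming GNU find and a recent GNU coreutils, the size-sorted equivalent would be find . -type f -printf '%s\t%p\0' | sort -z -n | cut -z -f 2- | python3 hello.py.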

Upvotes: 4
