Reputation: 68660

Sort images by dimensions

I have several images named ID-sequence.jpg, where the ID is the same for every image in a group, for example:

4fd-00027-1.jpg
4fd-00027-2.jpg
4fd-00027-3.jpg
6gq-00017-1.jpg
6gq-00017-2.jpg
6gq-00752-3.jpg
6gq-00752-4.jpg

... and I need to move the top 3 largest images (by dimensions), but I can't quite figure out how:

for file in `ls -v *.jpg`;
do
  IFS=: read -r width height < <(identify -ping -format '%w:%h' "$file")
  # how to compare each for dimensions?
  dir="/Users/eazzy/images_organized/${file%-*}"
  [ -d "$dir" ] || mkdir "$dir"
  echo moving "$file" to "${file%-*}"
  mv "/Users/eazzy/images_trimmed/$file" "$dir"
done

Upvotes: 2

Answers (3)

Ruslan Osmanov

Reputation: 21492

Sorting by the Hypotenuse

You can calculate the hypotenuse with an FX expression:

identify -format '%[fx:hypot(w,h)] : %f\n' *.jpg

where w and h stand for width and height, respectively, and %f stands for the filename (see the ImageMagick documentation on format and print image properties).

Sample output

1280 : gentoo_matrix.jpg
738.756 : LA-Woman-048.jpg
2812.64 : passport-photo.jpg
1835.76 : spring_makeup-wallpaper-1600x900.jpg
1196.22 : woman_painting_study_by_warnerator-d4z4s6u.jpg
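As a quick sanity check of what hypot(w,h) computes, here is a small sketch using awk. It assumes that the 1600x900 wallpaper really has the dimensions its name suggests; diag is a hypothetical helper name used only for illustration, not an ImageMagick command.

```shell
# diag WIDTH HEIGHT: print sqrt(w^2 + h^2) with 6 significant digits.
# (Hypothetical helper, for illustration only.)
diag() {
  awk -v w="$1" -v h="$2" 'BEGIN { printf "%.6g\n", sqrt(w * w + h * h) }'
}

diag 1600 900
```

For 1600 and 900 this prints 1835.76, which agrees with the value shown for that file in the sample output.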

The next steps are straightforward: sort in reverse human-numeric order and process the files in a loop:

identify -format '%[fx:hypot(w,h)] : %f\n' *.jpg | \
  sort -h -r | head -3 | \
  while IFS= read -r line; do
    file="${line#*: *}"
    echo "$file"
  done

Sample output

passport-photo.jpg
zh220.jpg
spring_makeup-wallpaper-1600x900.jpg

(top three files by hypotenuse).

Note that large numbers require special handling, because they are printed in scientific notation (see below).
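To see why, here is a sketch with made-up sizes and filenames. With sort -n the number is parsed only up to the "e", so 2.4576e+06 is compared as 2.4576 and lands below 921600. One possible alternative, assuming GNU sort (or another sort that supports it), is the general-numeric option -g, which understands the exponent. The sample and largest_first names are hypothetical.

```shell
# Made-up sample lines in the "size : filename" format used above.
sample() {
  printf '%s\n' '921600 : small.jpg' '2.4576e+06 : big.jpg'
}

# Largest first, with scientific notation understood (sort -g).
largest_first() {
  sort -g -r
}

sample | sort -n -r | head -1      # picks small.jpg, which is wrong
sample | largest_first | head -1   # picks big.jpg
```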

Sorting by the Area

Alternatively, you can calculate the area. The problem with the FX expressions is that big numbers are printed in scientific notation, e.g. 2.4576e+06 (2457600). You can handle this with awk's printf, for instance:

identify -format '%[fx:w*h] : %f\n' *.jpg  | \
  awk -F: '{ printf("%d :%s\n", $1, $2); }' | \
  sort -n -r  | head -3 | \
  while IFS= read -r line; do
    file="${line#*: *}"
    echo "$file"
  done

Note, since the numbers are in the normal decimal notation (non-scientific), we don't need human-numeric sorting here. It is safe to invoke the direct numerical sorting with sort -n.
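To illustrate what the awk step does on its own, here is a minimal sketch fed with a made-up input line (normalize_area is a hypothetical name):

```shell
# normalize_area: rewrite the leading number of each "area : filename"
# line as a plain integer, e.g. 2.4576e+06 becomes 2457600.
normalize_area() {
  awk -F: '{ printf("%d :%s\n", $1, $2) }'
}

printf '%s\n' '2.4576e+06 : big.jpg' | normalize_area
```

This prints "2457600 : big.jpg". One caveat of splitting on ":" is that a filename that itself contains a colon would be cut off at that colon, since only the second field is printed.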

The Case of a Large Number of Files

The *.jpg expression is expanded to a list of arguments by the shell. So if the number of images is very large, the expanded command line can exceed the system's argument-length limit, and you should process the files one by one instead, for instance:

find . -maxdepth 1 -type f -name '*.jpg' \
  -exec identify -format '%[fx:w*h] : %f\n' {} \; | \
  awk -F: '{ printf("%d :%s\n", $1, $2); }' | \
  sort -n -r  | head -3 | \
  while IFS= read -r line; do
    file="${line#*: *}"
    echo "$file"
  done

Sorting within "Groups"

When I first wrote this answer, it was unclear that you actually meant taking the top 3 images from each group, where a "group" is a prefix like ${filename%-*}, in your terms. So the real objective is to split the files into groups, then sort by image dimensions within each group.

The solution can be derived from what I have written above. We only need to apply the above to the groups:

process_group()
{
  group="$1"

  find . -maxdepth 1  -type f -iname "${group}-*.jpg" \
    -exec identify -format '%[fx:w*h] : %f\n' {} \; | \
    awk -F: '{ printf("%d :%s\n", $1, $2); }' | \
    sort -n -r  | head -3 | \
    while IFS= read -r line; do
      file="${line#*: *}"
      echo "$file"
    done
}

find . -maxdepth 1 -type f -regex '.*jpg$' -printf "%f\n" | \
  while IFS= read -r file ; do
    echo "${file%-*}"
  done | sort -u | while IFS= read -r group ; do
    process_group "$group"
  done
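As a rough sketch of how the grouping and the final move could be done in one pass, here are two helpers. Both names (top_per_group, move_into_group_dirs) are hypothetical, and the sketch assumes the input lines are already in the plain-decimal "area : filename" form produced by the awk step above, and that filenames contain neither " : " nor TAB characters.

```shell
# Hypothetical helpers, for illustration only.

# top_per_group [N]: read "area : filename" lines on stdin and print the
# N largest files (default 3) of every group. The group is the filename
# with its trailing "-sequence.jpg" part removed, like ${file%-*}.
top_per_group() {
  awk -F' : ' '{ g = $2; sub(/-[^-]*$/, "", g); print g "\t" $1 "\t" $2 }' |
    sort -t "$(printf '\t')" -k1,1 -k2,2nr |
    awk -F'\t' -v n="${1:-3}" '++seen[$1] <= n { print $3 }'
}

# move_into_group_dirs SRC DEST: read filenames on stdin and move each
# one from SRC into DEST/<group>/, creating the directory if needed.
move_into_group_dirs() {
  while IFS= read -r file; do
    dir="$2/${file%-*}"
    mkdir -p "$dir" && mv -- "$1/$file" "$dir/"
  done
}
```

With the directories from your question, the whole thing might then be chained like this (run from inside images_trimmed):

    identify -format '%[fx:w*h] : %f\n' *.jpg | \
      awk -F: '{ printf("%d :%s\n", $1, $2); }' | \
      top_per_group 3 | \
      move_into_group_dirs /Users/eazzy/images_trimmed /Users/eazzy/images_organized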

Upvotes: 3

zeppelin

Reputation: 9355

The following will give you a list of the top 3 images in each group, as identified by the first 3 letters of the filename, sorted by their total area in pixels (i.e. width*height):

find . -maxdepth 1 -name '*.jpg' -exec identify -format "%f %[w] %[h]\n" '{}' ';' |\
 awk '{ print substr($1,1,3)" "$1" "$2*$3; }' |\
 sort -k1,1 -k3,3nr | \
 awk '// { if(cgrp != $1) { cgrp=$1; cnt=0 } if(cnt++ < 3) { print  $1" "$2 } }'

This code assumes that you have the ImageMagick tools installed (which should not be a problem on most *nix systems).

(Note: if your image filenames may contain whitespace, you might need to modify this to use TAB as the delimiter.)
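A TAB-delimited variant could look roughly like the sketch below. The helper name top3_by_prefix is hypothetical, and it assumes its input arrives as "filename<TAB>width<TAB>height" lines and that filenames contain no TAB or newline characters.

```shell
# top3_by_prefix: read "filename<TAB>width<TAB>height" lines on stdin and
# print the 3 largest files (by area) per group, where the group is the
# first 3 characters of the filename. Hypothetical helper.
top3_by_prefix() {
  awk -F'\t' -v OFS='\t' '{ print substr($1, 1, 3), $1, $2 * $3 }' |
    sort -t "$(printf '\t')" -k1,1 -k3,3nr |
    awk -F'\t' '++cnt[$1] <= 3 { print $2 }'
}
```

To feed it, a literal TAB can be placed into the format string by the shell, so nothing depends on how identify treats escape sequences other than \n:

    tab="$(printf '\t')"
    find . -maxdepth 1 -name '*.jpg' -exec identify -format "%f${tab}%w${tab}%h\n" '{}' ';' | top3_by_prefix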

Upvotes: 0

neuhaus

Reputation: 4094

Here is a simple answer to sort them by width:

identify -ping -format '%w %f\n' *.jpg |\
sort -rn |\
head -3 |\
awk '{print $2 }'

The "identify" command will output the width followed by the filename.

The "sort" command will sort the list numerically, largest number first.

The "head" command will keep the first three entries.

The "awk" command will print only the second item on each line, which is the filename.
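One caveat: awk '{print $2 }' keeps only the first word of a filename that contains spaces. A possible variant, sketched here with made-up input lines and a hypothetical top3_widest helper, uses cut to keep everything after the first space instead:

```shell
# top3_widest: read "width filename" lines on stdin and print the
# filenames of the 3 widest images, keeping spaces in filenames intact.
top3_widest() {
  sort -rn | head -3 | cut -d' ' -f2-
}

printf '%s\n' '800 a.jpg' '1600 my photo.jpg' '1024 b.jpg' '640 c.jpg' | top3_widest
```

With real files it would be fed the same way as above: identify -ping -format '%w %f\n' *.jpg | top3_widest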

Upvotes: 1