Brooklyn Knightley

Reputation: 345

How can I check whether the two most recent files are identical in a shell script?

I want to check whether the two most recently modified files are different or not. This is my code, but it does not work.

#!/bin/bash
set -x -e

function test() {
  ls ~/Downloads/* -t | head -n 2 | xargs cmp -s
  echo $?
}

test

Thanks.

Upvotes: 1

Views: 110

Answers (1)

Charles Duffy

Reputation: 295353

Assuming that you have GNU find and GNU sort:

#!/bin/bash
#      ^^^^ - not /bin/sh, which lacks <()

{
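  # Each record is "<mtime> <path>" followed by NUL: read the timestamp up to
  # the first space and set it aside, then read the path up to the NUL.
  IFS= read -r -d ' ' mtime1 && IFS= read -r -d '' file1
  IFS= read -r -d ' ' mtime2 && IFS= read -r -d '' file2
} < <(find ~/Downloads -mindepth 1 -maxdepth 1 -type f -printf '%T@ %p\0' | sort -r -n -z)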

if cmp -s -- "$file1" "$file2"; then
  echo "Files are identical"
else
  echo "Files differ"
fi

If your operating system is macOS and you have GNU findutils and coreutils installed through MacPorts, Homebrew or Fink, you might need to replace find with gfind and sort with gsort to get the GNU rather than the BSD implementations of these tools.
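If you would rather not hard-code the names, one option is to prefer the g-prefixed command when it exists and fall back to the plain name otherwise. This is only a sketch: pick_gnu_tool, find_cmd and sort_cmd are names made up for illustration, and the helper assumes that a g-prefixed command on your PATH is the GNU build without actually checking.

```shell
#!/bin/bash
# pick_gnu_tool NAME: print "gNAME" if such a command exists, else NAME.
# Hypothetical helper; it does not verify that the chosen tool is really GNU.
pick_gnu_tool() {
  local name=$1
  if command -v "g$name" >/dev/null 2>&1; then
    printf '%s\n' "g$name"
  else
    printf '%s\n' "$name"
  fi
}

find_cmd=$(pick_gnu_tool find)
sort_cmd=$(pick_gnu_tool sort)
```

The pipeline can then be written with "$find_cmd" and "$sort_cmd" in place of find and sort.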

Key points here:

  • find is asked to emit a stream in the form of [epoch-timestamp] [filename][NUL]. NUL is used as the delimiter because it is the only character that cannot exist in a pathname.
  • sort is asked to sort this stream numerically (-n), in reverse order (-r) so that the newest file comes first, with NUL as the record separator (-z).
  • The first two items of the stream are read into shell variables.
  • Using the -- argument to cmp after options and before positional arguments ensures that filenames can never be parsed as options, even if they were to start with -.
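To make the stream format more concrete, here is a rough sketch that walks the same kind of stream record by record and prints the newest paths. The function name newest_files and its arguments are hypothetical, and it assumes bash plus GNU find and GNU sort, as above. Each path is printed with printf %q so that a name containing a newline still takes up exactly one output line.

```shell
#!/bin/bash
# newest_files DIR [COUNT]: print the COUNT (default 2) most recently modified
# regular files directly inside DIR, newest first, one shell-quoted path per line.
# Hypothetical helper; requires GNU find (-printf) and GNU sort (-z).
newest_files() {
  local dir=$1 count=${2:-2} mtime path
  # Each record is "<mtime> <path>" terminated by NUL. Read the timestamp up
  # to the first space, then the rest of the record (the path) up to the NUL.
  while (( count-- > 0 )) &&
        IFS= read -r -d ' ' mtime &&
        IFS= read -r -d '' path; do
    printf '%q\n' "$path"
  done < <(find "$dir" -mindepth 1 -maxdepth 1 -type f -printf '%T@ %p\0' | sort -r -n -z)
}
```

For example, newest_files ~/Downloads 2 lists the two files that the script above would compare.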

So, why not use ls -t? Consider (as an example) what happens if you have a file created with the command touch $'hello\nworld', with a literal newline partway through its name. Depending on your version of ls, it may be emitted as hello?world, hello^Mworld or hello\nworld (in any of these cases, a filename that doesn't actually exist if treated as literal and not glob-expanded), or as two lines, hello and world. This would mess up the rest of your pipeline. Things as simple as filenames with spaces will also break xargs with default options, as will filenames with literal quotes. xargs is only truly safe when used with the argument -0 to treat content as NUL-delimited, though it's less unsafe than the defaults when used with the GNU extension -d $'\n'.
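If you want to see the difference for yourself, a small experiment along these lines may help. It is only an illustration: the two function names are made up, and what the ls-based count reports depends on your ls, as described above.

```shell
#!/bin/bash
# Count directory entries the fragile way: one entry per line of ls output.
count_entries_ls() {
  ls -- "$1" | wc -l
}

# Count directory entries the robust way: one entry per NUL byte from find.
count_entries_nul() {
  find "$1" -mindepth 1 -maxdepth 1 -print0 | tr -cd '\0' | wc -c
}
```

In a directory that holds only the file created by touch $'hello\nworld', count_entries_nul reports 1, whereas count_entries_ls may report 2 if your ls writes the newline through unchanged.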

Upvotes: 1
