apenngrace

Reputation: 63

In Linux, how to compare two directories by filename only and list the files that did not match

I'd like to know how to compare two directories (not recursively) only by filename (ignore extension) to get the difference. For example, if I have list A and B, I want to know what is present in A and not in B.

I am currently processing some images. In one directory I have source files with the extension .tiff, and in the other directory I have processed files with the extension .png. The filenames are the same in both directories; only the extension differs (e.g., a file named foo.tiff in directory A is named foo.png in directory B).

I'm trying to find which files have not yet been processed.

Thanks!

Upvotes: 5

Views: 8399

Answers (4)

vvchik

Reputation: 1455

If I understand you correctly, you need the following script:

#!/bin/bash
# List files in folder1 that have no same-named counterpart in folder2.
folder1="/home/vagrant/1 b"
folder2="/home/vagrant/2 a"
ext1="tiff"
ext2="png"

# Quoting the variables keeps paths with spaces intact, so IFS does not
# need to be changed.
for fullfile in "$folder1"/*."$ext1"
do
        # If nothing matches, the glob stays literal; skip it
        [ -e "$fullfile" ] || continue
        filename=$(basename "$fullfile")
        # Strip the extension to get the bare name
        cleanfilename="${filename%.*}"
        if ! [ -e "$folder2/$cleanfilename.$ext2" ]
        then
                echo "$fullfile"
        fi
done

It shows the files that are present in the first folder but absent from the second, like this:

admin$ mkdir 1
admin$ mkdir 2
admin$ touch 1/1.tiff
admin$ touch 1/2.tiff
admin$ touch 1/3.tiff
admin$ touch 2/1.png
admin$ touch 2/2.png
admin$ vim diff.sh
admin$ chmod +x diff.sh 
admin$ ./diff.sh 
/Users/admin/1/3.tiff

Upvotes: 0

John1024

Reputation: 113834

First let's create a helper function:

getfiles() { find "$1" -maxdepth 1 -type f -exec bash -c 'for f in "$@"; do basename "${f%.*}"; done' "" {} + | sort; }

If you run getfiles dirname, it will return a sorted list of files in that directory without the directory's name and without any extension. The -maxdepth 1 option means that find will not search recursively.
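For readers who prefer to see the steps separately, here is a rough multi-line equivalent of the helper. This is only a sketch: the name getfiles_verbose is made up for illustration, and it strips the directory part before the extension, which is a small deviation from the one-liner. Like the original, it assumes a find that supports -maxdepth and file names without newlines.

```shell
# getfiles_verbose DIR (hypothetical name): print the names of the regular
# files directly inside DIR, without directory and without the last extension.
getfiles_verbose() {
    local path name
    # -maxdepth 1: do not descend into subdirectories
    # -type f:     regular files only
    find "$1" -maxdepth 1 -type f -print |
    while IFS= read -r path; do
        name=${path##*/}             # drop everything up to the last slash
        printf '%s\n' "${name%.*}"   # drop the last extension, if any
    done | sort
}
```

Running getfiles_verbose A should give the same list as getfiles A for ordinary file names.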

Now, let's compare the files in directories A and B:

diff <(getfiles A) <(getfiles B)

The output is in the usual diff format. As any of diff's normal options can be used, the output format is quite flexible.

Example

Here are sample directories A and B, each having one file that the other doesn't have:

$ ls */
A/:
bar.png  foo.png  qux.png

B/:
bar.tiff  baz.tiff  foo.tiff

The output:

$ diff <(getfiles A) <(getfiles B)
1a2
> baz
3d3
< qux

The output correctly identifies (a) that B has a baz file that is not present in A and (b) that A has a qux file that is not present in B.
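As one illustration of how diff's options can reshape this output, the sketch below labels each name with the side it came from. It assumes GNU diff, since the line-format options are GNU extensions, and the function name labelled_diff is made up for this example.

```shell
# Helper from this answer: sorted names, directory and extension stripped.
getfiles() { find "$1" -maxdepth 1 -type f -exec bash -c 'for f in "$@"; do basename "${f%.*}"; done' "" {} + | sort; }

# labelled_diff DIR1 DIR2 (hypothetical name): print one line per name that
# exists on only one side. Assumes GNU diff.
labelled_diff() {
    # %L expands to the line itself, including its trailing newline.
    diff --unchanged-line-format='' \
         --old-line-format='only in first: %L' \
         --new-line-format='only in second: %L' \
         <(getfiles "$1") <(getfiles "$2")
}
```

Note that diff exits with status 1 when the lists differ, so a script running under set -e would need to allow for that.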

Alternative Output

Suppose that we just want to do a one-sided comparison and find what files in B are not also in A. In this case, grep can be used:

$ grep -vxFf <(getfiles A) <(getfiles B)
baz

The options used here are:

  • -v tells grep to exclude matching lines

  • -x tells grep to match whole lines only

  • -F tells grep that the patterns are fixed strings, not regular expressions.

  • -f tells grep to get the list of patterns from file or, in this case, the file-like object <(getfiles A).
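The question asks for the other direction (names in A that are missing from B). Swapping the two arguments to grep covers that. Another option, not used in this answer, is comm, which works on two sorted inputs. The following is a sketch under the same assumptions as getfiles; the name only_in_first is made up for illustration.

```shell
# Helper from this answer: sorted names, directory and extension stripped.
getfiles() { find "$1" -maxdepth 1 -type f -exec bash -c 'for f in "$@"; do basename "${f%.*}"; done' "" {} + | sort; }

# only_in_first DIR1 DIR2 (hypothetical name): names found in DIR1 but not in DIR2.
# comm prints three columns: lines only in input 1, lines only in input 2,
# and lines in both. -23 suppresses columns 2 and 3, leaving column 1.
only_in_first() {
    comm -23 <(getfiles "$1") <(getfiles "$2")
}
```

For the image-processing case, only_in_first A B would list the base names of the .tiff files that have no matching .png yet.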

Example With File and Directory Names That Include Spaces

Consider these files:

$ ls */
A A/:
1 bar.png  1 foo.png  1 qux.png

B B/:
1 bar.tiff  1 baz.tiff  1 foo.tiff

The output:

$ diff <(getfiles 'A A') <(getfiles 'B B')
1a2
> 1 baz
3d3
< 1 qux

Or,

$ grep -vxFf <(getfiles 'A A') <(getfiles 'B B')
1 baz

Limitation

If any of your file names have newline characters in them, this will give incorrect results. At least for the grep form, this could be extended to the more general case.
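One way to sidestep the newline limitation is to leave the line-based pipeline behind and keep the names in shell variables instead. The sketch below does that with a bash associative array, so it assumes bash 4 or later. The function name only_in is made up for illustration. Unlike find -type f, the [ -f ] test follows symlinks, and the glob skips hidden files. It has not been checked against every unusual file name.

```shell
# only_in DIR1 DIR2 (hypothetical name): print the names, extension stripped,
# of regular files in DIR1 that have no counterpart in DIR2.
# Assumes bash 4+ for associative arrays.
only_in() {
    local -A seen=()
    local f name
    # Record every extension-stripped name found in DIR2.
    for f in "$2"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        seen["${name%.*}"]=1
    done
    # Report names from DIR1 that were not recorded.
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        name=${name%.*}
        [ -n "${seen[$name]+x}" ] || printf '%s\n' "$name"
    done
}
```

Because the names never pass through a line-oriented tool, a newline inside a name is compared like any other character. The printed output is still one name per line, so a consumer that must parse it would need a different separator.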

Upvotes: 8

Arnab Nandy

Reputation: 6702

The diff command can do this for you.

diff DirA/ DirB/

Upvotes: 0

Harish Prasanna

Reputation: 108

Hope this helps.

-q Report only whether the files differ, not the details of the differences.
-r When comparing directories, recursively compare any subdirectories found.

diff -qr /dir1 /dir2

Upvotes: 3
