user5835145

Filename search using Shell Script

I am trying to find the files that have the same name in two directories:

**dir1** (/MyFolder/sample/test1)
file1.txt
file2.txt
file3.txt
file4.txt

**dir2** (/MyFolder/sample/test2)
file1.txt
file4.txt

I am using the diff command as follows:

diff -sr /MyFolder/sample/test1/ /MyFolder/sample/test2/ | awk -F: '{print $1}' | grep -r ".txt"

The result is as follows:

Files /MyFolder/sample/test1/file1.txt and /MyFolder/sample/test2/file1.txt are identical
Files /MyFolder/sample/test1/file4.txt and /MyFolder/sample/test2/file4.txt are identical

The result that I am looking for is just the file name:

file1.txt
file4.txt

Any help is appreciated!!

Upvotes: 0

Views: 38

Answers (2)

loxxy

Reputation: 13151

A little fiddling with ls & grep should work too:

ls dir1 | grep "`ls dir2`"

Or, if it's a C shell:

ls dir1 | grep -E "`ls dir2 | tr '\n' '|'` "

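As observed by radical7, the first method won't work in a C shell, because the newlines get lost when the pattern is passed to grep. In such cases we can use a regex instead.

grep -E (or simply egrep) lets us use a regex of the form file1.txt|file2.txt as the pattern.

Also note that the whitespace at the end of the pattern is intentional: tr turns the final newline into a trailing |, and without the space the last alternative would be empty, which GNU grep treats as matching every line.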
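
One caveat with both forms above is that each name is interpreted as a regex, so the `.` in file1.txt matches any character and a name can also match as a substring of a longer one. A minimal sketch of a stricter variant, assuming a POSIX grep with the -F and -x options; the `common_names` helper and the temporary demo directories are hypothetical and only there for illustration:

```shell
#!/bin/sh
# common_names is a hypothetical helper, not part of the answer above.
# It prints the names that exist in both directories, one per line.
common_names() {
    # -F: treat each line of the pattern list as a fixed string, not a regex
    # -x: match only when the whole input line equals a pattern
    ls "$1" | grep -Fx -e "$(ls "$2")"
}

# Demo data mirroring the question's layout, built in a temporary directory.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/test1" "$demo/test2"
touch "$demo/test1/file1.txt" "$demo/test1/file2.txt" \
      "$demo/test1/file3.txt" "$demo/test1/file4.txt" \
      "$demo/test1/file1xtxt"   # would match the regex file1.txt without -F
touch "$demo/test2/file1.txt" "$demo/test2/file4.txt"

common_names "$demo/test1" "$demo/test2"
```

Like the original one-liner, this relies on a newline-separated pattern list, so it assumes a Bourne-style shell and file names that contain no newlines.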

Upvotes: 3

radical7

Reputation: 9134

Here's a (hopefully) simple, easy-to-understand method using some local files:

cd /MyFolder/sample
( cd test1 ; ls -1 * ) > test1-files
( cd test2 ; ls -1 * ) > test2-files
comm -12 test1-files test2-files

The comm command takes two sorted files (ls sorts its output for us in this case; otherwise you'd need to run sort) and outputs three columns: lines only in the first file, lines only in the second file, and lines in both files. To limit the output to what you asked for, the -12 option suppresses the first two columns.
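
To make the column selection concrete, here is a small sketch that builds the two lists by hand instead of from ls. The `show_common` and `show_only_first` wrappers and the temporary directory are assumptions for illustration; the list contents mirror the question's directories:

```shell
#!/bin/sh
# Temporary working directory for the two hypothetical name lists.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Both lists are written already sorted, as comm requires.
printf '%s\n' file1.txt file2.txt file3.txt file4.txt > "$work/test1-files"
printf '%s\n' file1.txt file4.txt > "$work/test2-files"

# -12 suppresses columns 1 and 2, leaving the lines common to both files.
show_common() { comm -12 "$1" "$2"; }

# -23 suppresses columns 2 and 3, leaving the lines only in the first file.
show_only_first() { comm -23 "$1" "$2"; }

echo "in both:"
show_common "$work/test1-files" "$work/test2-files"
echo "only in test1:"
show_only_first "$work/test1-files" "$work/test2-files"
```

The digits name the columns to hide, not the ones to show, which is why the common lines come from -12 rather than -3.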

However, if you want all this done without temporary files, you can use this sequence of pipes:

(cd test1 ; ls -1 ; cd ../test2 ; ls -1) | sort | uniq -c | grep -v "1 " | awk '{ print $2; }'

If you're unfamiliar with the commands-in-parentheses construct, it executes the commands in a subshell and combines their output into a single stdout stream, which is then passed along the pipe chain.

In fact, you can nest the commands:

( (cd test1 ; ls -1) ; (cd test2 ; ls -1) ) | ...

Note that the cd ../test2 from the previous example is gone. Each cd only affects its own subshell, so when that subshell exits you're back in the directory you started from.
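
As a hedged sketch of the nested form wired into a complete pipeline, the following swaps the uniq -c | grep | awk stage for uniq -d, which prints one copy of each line that occurs more than once. That swap is a substitution for illustration rather than the pipeline given above, and the `names_in_both` helper and demo directories are hypothetical:

```shell
#!/bin/sh
# names_in_both is a hypothetical helper. It assumes a name never repeats
# within one directory listing, so any duplicate after sorting must come
# from both directories.
names_in_both() {
    # Each inner subshell does its own cd, so neither affects the other
    # or the calling shell. The space in "( (" keeps the shell from
    # reading "((" as the start of an arithmetic expression.
    ( (cd "$1" && ls -1) ; (cd "$2" && ls -1) ) | sort | uniq -d
}

# Demo data mirroring the question's layout, built in a temporary directory.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/test1" "$demo/test2"
touch "$demo/test1/file1.txt" "$demo/test1/file2.txt" \
      "$demo/test1/file3.txt" "$demo/test1/file4.txt"
touch "$demo/test2/file1.txt" "$demo/test2/file4.txt"

before=$(pwd)
names_in_both "$demo/test1" "$demo/test2"
after=$(pwd)

# The calling shell's working directory is untouched by the cd commands.
[ "$before" = "$after" ] && echo "working directory unchanged"
```

Because uniq -d keeps the whole line, this variant also copes with names containing spaces, which the awk '{ print $2; }' step would truncate.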

Upvotes: 0
