Reputation: 27
I have the following problem. There are two nested folders A and B. They are mostly identical, but B has a few files that A does not. (These are two mounted rootfs images). I want to create a shell script that does the following:
The goal is to import the additional data from image B afterwards on an embedded system that contains the contents of image A.
For the first step I put together the following snippet. A note on the grep pattern: my system locale is German, where diff prints "Nur in" instead of "Only in":
diff -rq <A> <B>/ 2>/dev/null | grep Nur | awk '{print substr($3, 1, length($3)-1) "/" substr($4, 1, length($4)-1)}'
The result is the output of the paths relative to folder B.
I have no idea how to implement the second step. Can someone give me some help?
Upvotes: 1
Views: 128
Reputation: 189387
Using diff to find files that exist on only one side is severe overkill: it also compares the contents of every file present in both trees, when clearly all you care about is whether a file name exists or not.
Maybe try this instead.
tar zcf newfiles.tar.gz $(comm -13 <(cd A && find . -type f | sort) <(cd B && find . -type f | sort) | sed 's/^\./B/')
The find commands produce a listing of the file name hierarchies; comm -13 extracts the elements which are unique to the second input file (which here isn't really a file at all; we are using the shell's process substitution facility to provide the input), and the sed command adds the path into B back to the beginning.
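To see what each stage of that pipeline contributes, here is a minimal sketch on a throwaway tree. The directory and file names are made up for the demo, and temporary list files stand in for the process substitution so that it also runs in a plain sh; LC_ALL=C is my addition to keep sort and comm on the same collation order.

```shell
# Throwaway demo tree (hypothetical names): B has two files that A lacks.
demo=$(mktemp -d)
mkdir -p "$demo/A/etc" "$demo/B/etc"
touch "$demo/A/etc/hosts" "$demo/B/etc/hosts" "$demo/B/etc/extra.conf" "$demo/B/new.bin"

# Sorted listings of paths relative to each tree.
(cd "$demo/A" && find . -type f | LC_ALL=C sort) > "$demo/a.list"
(cd "$demo/B" && find . -type f | LC_ALL=C sort) > "$demo/b.list"

# -1 drops lines found only in a.list, -3 drops lines common to both,
# so only the lines unique to b.list remain; sed swaps the leading "." for "B".
only_in_b=$(LC_ALL=C comm -13 "$demo/a.list" "$demo/b.list" | sed 's/^\./B/')
printf '%s\n' "$only_in_b"
```

For this tree it prints B/etc/extra.conf and B/new.bin, one per line.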
Passing a command substitution $(...) as the argument to tar is problematic; if there are a lot of file names, you will run into "command line too long", and if your file names contain whitespace or other irregularities, the shell will mess them up. The standard solution is to use xargs, but using xargs tar cf will overwrite the output file if xargs ends up calling tar more than once; though perhaps your tar has an option to read the file names from standard input.
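As a sketch of that last idea, assuming your tar accepts -T - (short for --files-from=-, which GNU tar and bsdtar both provide; other implementations may not): the list is piped straight into tar, so the names never pass through the command line. Unlike the one-liner above, this variant uses -C to archive the names relative to B instead of prefixing them with B/. The demo tree is hypothetical, and names containing newlines would still break it.

```shell
# Throwaway demo tree (hypothetical names): B has c and d, which A lacks.
demo=$(mktemp -d)
mkdir -p "$demo/A" "$demo/B"
touch "$demo/A/a" "$demo/B/a" "$demo/B/c" "$demo/B/d"

(cd "$demo/A" && find . -type f | LC_ALL=C sort) > "$demo/a.list"
(cd "$demo/B" && find . -type f | LC_ALL=C sort) > "$demo/b.list"

# Stream the names unique to B into tar. -T - reads the member list from
# standard input; -C makes tar resolve the relative names inside B.
LC_ALL=C comm -13 "$demo/a.list" "$demo/b.list" |
    tar -czf "$demo/newfiles.tar.gz" -C "$demo/B" -T -

# List the archive to check what went in.
tar -tzf "$demo/newfiles.tar.gz"
```

Because tar is invoked exactly once, the overwrite problem described for xargs does not arise.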
Upvotes: 2
Reputation: 29040
With find:
$ mkdir -p A B
$ touch A/a A/b
$ touch B/a B/b B/c B/d
$ cd B
$ find . -type f -exec sh -c '[ ! -f ../A/"$1" ]' _ {} \; -print
./c
./d
The idea is to use the -exec action with a shell script that tests the existence of the current file in the other directory. There are a few subtleties:
The first argument of sh -c is the script to execute, the second (here _ but could be anything else) corresponds to the $0 positional parameter of the script, and the third ({}) is the current file name as set by find and passed to the script as positional parameter $1.
The -print action at the end is needed, even if it is normally the default with find, because the use of -exec cancels this default.
Example of use to generate your tarball with GNU tar:
$ cd B
$ find . -type f -exec sh -c '[ ! -f ../A/"$1" ]' _ {} \; -print > ../list.txt
$ tar -c -v -f ../diff.tar --files-from=../list.txt
./c
./d
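The role of the _ placeholder in the sh -c call used above can be checked in isolation. This is only an illustration; ./some/file is a made-up name standing in for what find would substitute for {}.

```shell
# The word right after the script text becomes $0, the next one becomes $1.
with_placeholder=$(sh -c 'printf "%s|%s" "$0" "$1"' _ ./some/file)

# Without the placeholder, the file name lands in $0 and $1 stays empty,
# so a test on "$1" would look at the wrong thing.
without_placeholder=$(sh -c 'printf "%s|%s" "$0" "$1"' ./some/file)

printf '%s\n' "$with_placeholder" "$without_placeholder"
```

The first line printed is _|./some/file and the second is ./some/file| with nothing after the bar.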
Note: if you have unusual file names, the --verbatim-files-from option of GNU tar can help. Or a combination of the -print0 action of find and the --null option of GNU tar.
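A minimal sketch of the -print0 and --null combination, assuming GNU tar and a find that supports -print0; the demo tree and the file name with a space are made up. Note that --null is placed before --files-from, since it applies to the list options that follow it.

```shell
# Throwaway demo tree (hypothetical names), including a name with a space.
demo=$(mktemp -d)
mkdir -p "$demo/A" "$demo/B"
touch "$demo/A/a" "$demo/B/a" "$demo/B/c" "$demo/B/with space"

(
    cd "$demo/B" || exit 1
    # Same existence test as before, but each name is terminated by a NUL byte.
    find . -type f -exec sh -c '[ ! -f ../A/"$1" ]' _ {} \; -print0 > ../list0.txt
    # --null tells tar that the list is NUL-delimited.
    tar -c -f ../diff.tar --null --files-from=../list0.txt
)

# List the archive, sorted so the output order does not depend on find.
tar -t -f "$demo/diff.tar" | LC_ALL=C sort
```

With this tree the archive holds ./c and ./with space.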
Note: if your sh is POSIX-compliant (bash is, for example), you can also run find from the parent directory and get the paths of the files relative to it, if you prefer. The ${1#B} expansion used for this strips the leading B from the file name:
$ mkdir -p A B
$ touch A/a A/b
$ touch B/a B/b B/c B/d
$ find B -type f -exec sh -c '[ ! -f A"${1#B}" ]' _ {} \; -print
B/c
B/d
Upvotes: 2