Reputation: 150
I found a solution for my question in Windows but I'm using Ubuntu: How to copy a directory structure but only include certain files using Windows batch files?
As the title says, how can I recursively copy a directory structure but only include some files? For example, given the following directory structure:
folder1
folder2
folder3
data.zip
info.txt
abc.xyz
folder4
folder5
data.zip
somefile.exe
someotherfile.dll
The files data.zip and info.txt can appear anywhere in the directory structure. How can I copy the full directory structure, but only include files named data.zip and info.txt (all other files should be ignored)?
The resulting directory structure should look like this:
copy_of_folder1
folder2
folder3
data.zip
info.txt
folder4
folder5
data.zip
Could you tell me a solution for Ubuntu?
Upvotes: 7
Views: 5707
Reputation: 993
tar is a surprisingly useful tool in this space. Below I tell it to create (c), verbosely (v), a file (f) on stdout (-), taking the file list from find running in a subshell ($( ... )), then pipe that stream into tar again, asking it to extract (x) the files into the destination directory (-C ~/destination).
It is assumed that the destination directory is empty. If it is not, the files matched by find are simply updated; files that exist in ~/destination but not in ~/source are not removed.
To use this, start in the source directory:
bob@home:~$ cd source
bob@home:~/source$ tar cvf - $( find -name "info.txt" -o -name "data.zip" ) | tar x -C ~/destination
Here are the contents of ~/source:
bob@home:~/source$ find
.
./file-i-dont-want
./info.txt
./data.zip
./folder1
./folder1/folder2
./folder1/folder2/another-trashy-file
./folder1/folder2/data.zip
./folder1/info.txt
Note that the -o switch in find means or.
And here are the contents of the destination after the operation:
bob@home:~/destination$ find
.
./info.txt
./data.zip
./folder1
./folder1/folder2
./folder1/folder2/data.zip
./folder1/info.txt
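The command above passes the find output through an unquoted $( ... ), so names containing spaces get split. A minimal sketch of a null-delimited variant follows, assuming GNU find and GNU tar (as shipped on Ubuntu). The function name copy_selected is hypothetical, and adding -type d with --no-recursion is my own addition so that directories without matching files are kept as well.

```shell
# Sketch only: copy info.txt and data.zip from src to dst, keeping every
# directory of the source tree. Assumes GNU find and GNU tar.
# copy_selected is a hypothetical helper name.
copy_selected() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    # -print0 and --null keep names with spaces or newlines intact.
    # --no-recursion makes tar store each directory entry by itself,
    # so unwanted files inside a listed directory are not archived.
    (cd "$src" && find . \( -type d -o -name info.txt -o -name data.zip \) -print0 |
        tar --null --no-recursion -T - -cf -) |
        tar -xf - -C "$dst"
}

# Example call: copy_selected ~/source ~/destination
```

Drop the -type d test if you only want the directories that lead to a matched file, which is what the original command produces.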
Upvotes: 0
Reputation: 18875
Here is a one-liner using rsync:
rsync -a -f"+ info.txt" -f"+ data.zip" -f'-! */' folder1/ copy_of_folder1/
If you already have a file list and want a more scalable solution:
xargs -I{} rsync -a -f"+ {}" -f'-! */' folder1/ copy_of_folder1/ < file.list
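The xargs pipeline starts one rsync per name in the list. As an alternative sketch, rsync can read all include patterns from the file in a single run via --include-from. This assumes file.list holds one file name per line (for example info.txt and data.zip); the function name copy_from_list is hypothetical.

```shell
# Sketch only: one rsync run driven by a pattern file.
# copy_from_list is a hypothetical helper name.
# $1 = pattern file (one name per line), $2 = source dir, $3 = destination dir
copy_from_list() {
    list=$1
    src=$2
    dst=$3
    # Each line of the list becomes an include rule; the final filter
    # excludes everything that is not a directory.
    rsync -a --include-from="$list" --filter='-! */' "$src"/ "$dst"/
}

# Example call: copy_from_list file.list folder1 copy_of_folder1
```

Lines in the list are treated as rsync patterns, so names containing characters such as * or [ are matched as wildcards rather than literally.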
Upvotes: 1
Reputation: 1629
cp -pr folder1 copy_of_folder1; find copy_of_folder1 -type f ! \( -name data.zip -o -name info.txt \) -exec rm -f {} \;
Upvotes: 0
Reputation: 15744
$ rsync --recursive --include="data.zip" --include="*.txt" --filter="-! */" dir_1 copy_of_dir_1
To exclude dir3 regardless of where it is in the tree (even if it contains files that would match the --include rules):
--exclude 'dir3/' (before `--filter`)
To exclude dir3 only at a specific location in the tree, specify an absolute path, starting from your source dir:
--exclude '/dir1/dir2/dir3/' (before `--filter`)
To exclude dir3 only when it's in dir2, but regardless of where dir2 is:
--exclude 'dir2/dir3/' (before `--filter`)
Wildcards can also be used in the path elements, where * means a directory with any name and ** means multiple nested directories.
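To show where the --exclude fragments fit, here is a sketch of a complete command using the first variant (dir3 excluded anywhere in the tree). The function name copy_without_dir3 is hypothetical, and the trailing slashes on source and destination are my choice so that the contents are copied directly into the target.

```shell
# Sketch only: the include rules, then the directory exclude, then the
# catch-all filter. copy_without_dir3 is a hypothetical helper name.
# $1 = source dir, $2 = destination dir
copy_without_dir3() {
    rsync --recursive \
        --include='data.zip' --include='*.txt' \
        --exclude='dir3/' \
        --filter='-! */' \
        "$1"/ "$2"/
}

# Example call: copy_without_dir3 dir_1 copy_of_dir_1
```

Because rsync stops at the first matching rule and never descends into an excluded directory, a .txt file inside any dir3 is skipped even though it matches an include rule.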
To specify only files and dirs to include, run two rsyncs, one for the files and one for the dirs. The problem with getting it done in a single rsync is that when you don't include a dir, rsync won't enter the dir and so won't discover any files in that branch that may be matching your include filter. So, you start by copying the files you want while not creating any dirs that would be empty. Then copy any dirs that you want.
$ rsync --recursive --prune-empty-dirs --include="*.txt" --filter="-! */" dir_1 copy_of_dir_1
$ rsync --recursive --include '/dir1/dir2/' --include '/dir3/dir4/' --filter="-! */" dir_1 copy_of_dir_1
You can combine these if you don't mind that your specified dirs don't get copied if they're empty:
$ rsync --recursive --prune-empty-dirs --include="*.txt" --include '/dir1/dir2/' --include '/dir3/dir4/' --filter="-! */" dir_1 copy_of_dir_1
The --filter="-! */" is necessary because rsync includes all files and folders that match none of the filters (imagine it as an invisible --include filter at the end of the list of filters). rsync checks each item to be copied against the list of filters and includes or excludes the item depending on the first match it finds. If there's no match, it hits that invisible --include and goes on to include the item. We wanted to change this default to --exclude, so we added an exclude filter (the - in -! */), then we negate the match (!) and match all dirs (*/). Since this is a negated match, the result is that we allow rsync to enter all the directories (which, as I mentioned earlier, allows rsync to find the files we want).
We use --filter instead of --exclude for the final filter because --exclude does not allow specifying negated matches with the ! operator.
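A quick way to check how a filter chain behaves before copying anything is a dry run. The sketch below only lists what rsync would transfer; the function name preview_filter is hypothetical, and the include patterns are the ones from the first command in this answer.

```shell
# Sketch only: list the paths a filter chain would transfer, without
# copying. preview_filter is a hypothetical helper name.
# $1 = source dir, $2 = destination dir
preview_filter() {
    src=$1
    dst=$2
    rsync --recursive --dry-run --verbose \
        --include='data.zip' --include='*.txt' \
        --filter='-! */' \
        "$src"/ "$dst"/
}

# Example call: preview_filter dir_1 copy_of_dir_1
```

Removing the --filter line from the sketch and running it again should list every file in the tree, which illustrates the default include described above.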
Upvotes: 7
Reputation: 394
I don't have a beautiful one liner, but since nobody else has answered you can always:
find . -name 'file_name.extension' -print | cpio -pavd /path/to/receiving/folder
Run this once for each file name you want, after copying the directories.
(Make sure you're in the original folder first, of course! :) )
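Both file names can also be handled in one pass by joining the find tests with -o. This is a minimal sketch assuming GNU find and GNU cpio; the function name copy_with_cpio is hypothetical, and the flags differ slightly from the command above (null-delimited names, -m to keep modification times, no verbose output).

```shell
# Sketch only: copy data.zip and info.txt in a single find | cpio pass.
# copy_with_cpio is a hypothetical helper name.
# $1 = source dir, $2 = destination dir (absolute path)
copy_with_cpio() {
    src=$1
    dst=$2    # must be absolute, because the subshell changes directory
    mkdir -p "$dst"
    # -p: pass-through copy, -d: create leading directories as needed,
    # -m: preserve modification times, --null: read NUL-terminated names
    (cd "$src" && find . \( -name data.zip -o -name info.txt \) -print0 |
        cpio --null -pdm --quiet "$dst")
}

# Example call: copy_with_cpio ~/source /path/to/receiving/folder
```

Only directories that lead to a matched file are created this way, so empty branches of the source tree still need a separate directory copy.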
Upvotes: 5