Reputation: 12007
I have a directory structure that looks like:
main_directory/
directory1:
sub_directory1:
files:
myfile.txt
otherfile.txt
sub_directory2:
files:
myfile.txt
otherfile.txt
sub_directory3:
files:
myfile.txt
otherfile.txt
sub_directory4:
files:
myfile.txt
otherfile.txt
directory2:
sub_directory1:
files:
myfile.txt
otherfile.txt
sub_directory2:
files:
myfile.txt
otherfile.txt
sub_directory3:
files:
myfile.txt
otherfile.txt
sub_directory4:
files:
myfile.txt
otherfile.txt
I am trying to figure out (by trial and error, because I'm not an expert at Linux) how to gzip only the myfile.txt files in all the directories. Since they all have the same filename in different paths (there was no way around this), I need to keep each file's path in the archive as well. So the final gzipped tar file I am looking to create would have the contents:
mytar.tar.gz
main_directory/directory1/sub_directory1/files/myfile.txt
main_directory/directory1/sub_directory2/files/myfile.txt
main_directory/directory1/sub_directory3/files/myfile.txt
main_directory/directory1/sub_directory4/files/myfile.txt
main_directory/directory2/sub_directory1/files/myfile.txt
main_directory/directory2/sub_directory2/files/myfile.txt
main_directory/directory2/sub_directory3/files/myfile.txt
main_directory/directory2/sub_directory4/files/myfile.txt
Is there a simple bash way to do this? I suppose I could write a Python script to do it, but that seems overkill.
Does anyone have any advice?
Upvotes: 4
Views: 302
Reputation: 189397
If the directory structure is indeed this regular, the wildcard
main_directory/*/*/files/myfile.txt
will match the files you want. However, if there are many files, you may need to resort to find/xargs in order to avoid the "argument list too long" (ARG_MAX) problem.
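As a rough illustration of the wildcard approach, the sketch below builds a small throwaway copy of the layout from the question (the temporary directory and the reduced number of subdirectories are assumptions made for the demo) and passes the expanded wildcard straight to tar. It is only meant for trees small enough to stay under ARG_MAX.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a small throwaway copy of the layout (hypothetical demo data).
workdir=$(mktemp -d)
cd "$workdir"
for d in directory1 directory2; do
  for s in sub_directory1 sub_directory2; do
    mkdir -p "main_directory/$d/$s/files"
    echo "keep" > "main_directory/$d/$s/files/myfile.txt"
    echo "skip" > "main_directory/$d/$s/files/otherfile.txt"
  done
done

# The shell expands the wildcard into one argument per matching file;
# tar stores each path as given, so the directory structure is preserved.
tar -czf mytar.tar.gz main_directory/*/*/files/myfile.txt

# List the archive to confirm that only myfile.txt entries were stored.
tar -tzf mytar.tar.gz
```

Because the paths are relative to the current directory, the archive entries start with main_directory/, which matches the listing the question asks for.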
If there are files named myfile.txt which you do not want to include because their path does not match the wildcard exactly, there are certainly ways to exclude them with find, too; perhaps then this additional constraint should be stated in the question.
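One possible way to express that constraint with find is sketched here, assuming GNU find and GNU tar (the -mindepth/-maxdepth tests and the --null option are not POSIX). The depth of 4 and the stray files are assumptions chosen to mirror the layout in the question; adjust them to the real tree.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway demo tree (hypothetical data) with two stray copies.
workdir=$(mktemp -d)
cd "$workdir"
for d in directory1 directory2; do
  for s in sub_directory1 sub_directory2; do
    mkdir -p "main_directory/$d/$s/files"
    echo "keep" > "main_directory/$d/$s/files/myfile.txt"
  done
done
# Stray copies at other depths that should NOT end up in the archive.
echo "stray" > main_directory/myfile.txt
echo "stray" > main_directory/directory1/myfile.txt

# Only accept regular files exactly four levels below main_directory
# whose path ends in files/myfile.txt, then hand the list to tar.
find main_directory -mindepth 4 -maxdepth 4 -type f \
     -path '*/files/myfile.txt' -print0 |
  tar -czf mytar.tar.gz --null -T -

tar -tzf mytar.tar.gz
```

The depth limits matter because the * in a find -path pattern also matches slashes, so a pattern alone would not pin the file to one level.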
Upvotes: 0
Reputation: 23542
Assuming there are not too many files, you can do something like:
cd main_directory/..
find main_directory -name "myfile.txt" | xargs tar zcf mytar.tar.gz
In the event that there are a lot of files, xargs may split the list across several tar invocations, each of which overwrites the archive. Instead, you can pipe the file list straight into tar:
find main_directory -name "myfile.txt" -print0 | tar zcf mytar.tar.gz --null -T -
This prints the filenames separated by nulls (-print0 to find) and instructs tar to parse that correctly from stdin; using nulls ensures that any special characters in directory names are handled properly.
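To see why the null separators help, here is a small sketch, assuming GNU tar, with two made-up directory names: one containing a space and one containing a newline. It archives them with the null-delimited pipeline and then extracts the result to check that both files come back.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical awkward directory names: one with a space, one with a newline.
spaced='main_directory/dir with space/sub/files'
newlined=$'main_directory/dir\nwith newline/sub/files'
mkdir -p "$spaced" "$newlined"
echo "a" > "$spaced/myfile.txt"
echo "b" > "$newlined/myfile.txt"

# Null-delimited names survive the pipe intact, whatever they contain.
find main_directory -name "myfile.txt" -print0 |
  tar -czf mytar.tar.gz --null -T -

# Extract into a scratch directory and check that both files were restored.
mkdir extracted
tar -xzf mytar.tar.gz -C extracted
for dir in "$spaced" "$newlined"; do
  if [ -f "extracted/$dir/myfile.txt" ]; then
    echo "restored one file"
  fi
done
```

A newline-delimited list would cut the second path in two, which is the case the null-delimited form is meant to avoid.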
Upvotes: 2
Reputation: 12007
This overcame the issue described in the other answer.
find main_directory/ -name "myfile.txt" | tar -czvf mytar.tar.gz -T -
Upvotes: 4
Reputation: 80931
With a new enough version of bash (4.0.0+, I believe) and a number of other shells, the following will work once the globstar option is enabled (in bash, with shopt -s globstar; it is off by default):
tar -czf mytar.tar.gz main_directory/**/myfile.txt
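A minimal sketch of the recursive glob in bash follows, using a made-up demo tree. Note that ** matches at any depth, so a myfile.txt sitting outside a files directory is archived as well; the extra file below is an assumption added to show that.

```shell
#!/usr/bin/env bash
set -euo pipefail
shopt -s globstar   # needs bash 4.0 or newer; the option is off by default

# Throwaway demo tree (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p main_directory/directory1/sub_directory1/files \
         main_directory/directory2/sub_directory1/files
echo "keep"  > main_directory/directory1/sub_directory1/files/myfile.txt
echo "keep"  > main_directory/directory2/sub_directory1/files/myfile.txt
echo "skip"  > main_directory/directory2/sub_directory1/files/otherfile.txt
echo "extra" > main_directory/directory1/myfile.txt

# ** descends through any number of directory levels, so the copy in
# main_directory/directory1/ is matched along with the two nested ones.
tar -czf mytar.tar.gz main_directory/**/myfile.txt

tar -tzf mytar.tar.gz
```

Like any glob, this is still expanded on the command line, so it is subject to the same ARG_MAX limit as the plain wildcard.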
Upvotes: 0