Reputation: 334
I extracted a layer from a Docker image; it is archived in a file called layer.tar. I want to remove the empty directories from it.
I don't want to unpack and then repack the files in that archive. I want to keep the original info, so I want to do it in-place.
I know how to delete files from a tar archive, but I don't know any simple method to delete empty directories in-place.
Upvotes: 0
Views: 709
Reputation: 140990
Let's create an archive t.tar with the empty directories a/b/c/ and a/b/c/d/:
mkdir -p dir
cd dir
mkdir -p a/b/c/d
mkdir -p 1/2/3/4
touch a/file_a a/b/file_ab # directories a/b/c and a/b/c/d contain no files
touch 1/2/3/file_123 1/2/3/4/file_1234 # directories 1/2/3 and 1/2/3/4 are not empty
tar cf ../t.tar a 1
cd ..
Using tar tf and some filtering, we can extract the lists of directories and files in a tar archive. Then, for each directory in tmpdirs, we can check with a simple grep whether it has any files in tmpfiles, and remove the directories that have none using tar's --delete option:
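# List the archive once: directory entries end in "/", everything else is a file.
tar tf t.tar | tee >(grep '/$' > tmpdirs) | grep -v '/$' > tmpfiles
# Print each directory that matches no file path (-F: literal match, not a regex),
# reverse the order, and delete those entries; -r skips tar if nothing is left.
cat tmpdirs | xargs -n1 -- sh -c 'grep -qF -- "$1" tmpfiles || echo "$1"' -- \
| tac \
| xargs -r -- tar --delete -f t.tar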
Note that tac is not strictly needed, but the entries were sorted alphabetically in the archive, so a/b/c/ is listed before a/b/c/d/. When tar removes the directory a/b/c/ with all its subdirectories first and then tries to remove a/b/c/d/, it fails with a Not found in archive error. tac is a cheap way to fix that, so tar first removes a/b/c/d/ and then a/b/c/.
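A possible refinement, sketched under a few assumptions: the grep check matches the directory name anywhere in a file path, so a directory such as b/ would be treated as non-empty because of an unrelated file like a/b/file_ab. The helper list_empty_dirs below (a made-up name, for illustration only) marks only the true ancestors of each file and prints the remaining directories in reverse listing order, which also takes over the job of tac. It assumes entry names contain no newlines and that the listing shows each directory before its contents.

```shell
# Hypothetical helper: reads a `tar tf` listing on stdin and prints every
# directory entry that has no file anywhere beneath it, in reverse listing order.
list_empty_dirs() {
    awk '
        /\/$/ { dirs[++n] = $0; next }   # directory entries end in "/"
        {                                # file entry: mark each ancestor directory
            prefix = ""
            m = split($0, parts, "/")
            for (i = 1; i < m; i++) {
                prefix = prefix parts[i] "/"
                used[prefix] = 1
            }
        }
        END {                            # children come out before their parents
            for (i = n; i >= 1; i--)
                if (!(dirs[i] in used)) print dirs[i]
        }
    '
}

# Possible usage with GNU tar (names containing whitespace would need extra care):
# tar tf t.tar | list_empty_dirs | xargs -r -- tar --delete -f t.tar
```

With the example archive this should select the same two directories as the pipeline above, without the temporary files.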
Upvotes: 1