Reputation: 17906
I want to write a (preferably python) script to modify the content of one file in a gzipped tar file. The script must run on FreeBSD 6+.
Basically, I need to:
I'll be repeating this for a lot of files.
Python's tarfile module doesn't seem to be able to open tar files for read/write access when they're compressed, which makes a certain amount of sense. However, I can't find a way to copy the tar file with modifications, either.
Is there an easy way to do this?
Upvotes: 4
Views: 7879
Reputation: 17415
I think David Phillips already answered quite well, but here's some example code on top:
import tarfile

with tarfile.open(input_tar_file, 'r:gz') as input_archive:
    with tarfile.open(output_tar_file, 'w:gz') as output_archive:
        # Iterate over the members directly so duplicate names are copied too.
        for info in input_archive:
            name = info.name
            # extractfile() returns None for members without data (e.g. directories).
            file = input_archive.extractfile(info)
            print(f'loaded {name} size {info.size}')
            output_archive.addfile(info, file)
This code copies input_tar_file to output_tar_file. If you want to modify things, start at the print() call. There, you can inspect the input, discard it, or modify it as you desire.
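As a rough sketch of what such a modification could look like, the function below copies the archive and passes the bytes of a single member through a callback. The names rewrite_member, target_name and transform are made up for illustration and are not part of the tarfile module. It assumes the member is small enough to hold in memory.

```python
import io
import tarfile


def rewrite_member(input_tar_file, output_tar_file, target_name, transform):
    """Copy a .tar.gz archive, passing one member's bytes through transform().

    rewrite_member, target_name and transform are hypothetical names.
    """
    with tarfile.open(input_tar_file, 'r:gz') as input_archive, \
            tarfile.open(output_tar_file, 'w:gz') as output_archive:
        for info in input_archive:
            if not info.isfile():
                # Directories, links, etc. carry no data: copy the header only.
                output_archive.addfile(info)
                continue
            file = input_archive.extractfile(info)
            if info.name == target_name:
                data = transform(file.read())
                # The header size must match the length of the new payload.
                info.size = len(data)
                file = io.BytesIO(data)
            output_archive.addfile(info, file)
```

For example, rewrite_member('in.tar.gz', 'out.tar.gz', 'etc/motd', lambda b: b.replace(b'old', b'new')) would write a copy in which only etc/motd differs (the paths here are placeholders).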
Things to keep in mind:
info.size, the other is implicitly given by the length of the file stream.
Upvotes: 0
Reputation: 7933
I don't see an easy way to remove a single file. You can easily extract one or all, then add any files needed.
I think that the only way is:
Be sure to set the correct format when you re-create the archive:
tarfile.USTAR_FORMAT: POSIX.1-1988 (ustar) format.
tarfile.GNU_FORMAT: GNU tar format.
tarfile.PAX_FORMAT: POSIX.1-2001 (pax) format.
tarfile.DEFAULT_FORMAT
http://docs.python.org/library/tarfile.html
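To illustrate picking the format on re-creation, here is a minimal sketch that passes the format keyword to tarfile.open() when writing. The helper write_archive and its archive_format parameter are hypothetical names; which format the original archive used is something you have to know or decide yourself, and the sketch simply defaults to GNU_FORMAT as an assumption.

```python
import io
import tarfile


def write_archive(path, members, archive_format=tarfile.GNU_FORMAT):
    """Write (name, bytes) pairs to a gzipped tar in the chosen tar format.

    write_archive and archive_format are hypothetical names for this sketch.
    """
    with tarfile.open(path, 'w:gz', format=archive_format) as archive:
        for name, data in members:
            info = tarfile.TarInfo(name)
            info.size = len(data)  # size must be set before addfile()
            archive.addfile(info, io.BytesIO(data))
```

The format matters mostly for edge cases such as very long path names, which the GNU and pax formats can store and plain ustar may not.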
Upvotes: 1
Reputation: 10208
Don't think of a tar file as a database that you can read/write -- it's not. A tar file is a concatenation of files. To modify a file in the middle, you need to rewrite the rest of the file. (For files of a certain size, you might be able to exploit the block padding.)
What you want to do is process the tarball file by file, copying files (with modifications) into a new tarball. The Python tarfile module should make this easy to do. You should be able to retain the attributes by copying them from the old TarInfo object to the new one.
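A small sketch of the attribute copying, assuming you build a fresh TarInfo for the replaced file rather than reusing the old one. The helper name replacement_info and the particular list of attributes are my choice for illustration, not something the module prescribes.

```python
import tarfile


def replacement_info(old_info, new_size):
    """Build a new TarInfo that keeps the old member's metadata.

    replacement_info is a hypothetical helper name for this sketch.
    """
    new_info = tarfile.TarInfo(old_info.name)
    # Carry over type, permissions, ownership and timestamp from the old header.
    for attr in ('type', 'mode', 'mtime', 'uid', 'gid', 'uname', 'gname'):
        setattr(new_info, attr, getattr(old_info, attr))
    # Only the size changes, to match the modified content.
    new_info.size = new_size
    return new_info
```

While copying into the new tarball you would then call something like output_archive.addfile(replacement_info(info, len(data)), io.BytesIO(data)) for the modified member, where data holds the new bytes.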
Upvotes: 6