Reputation: 63
I have a zip archive whose internal structure looks like this:
file.zip
 |
 --- foo/
 |
 --- bar/
      |
      --- file1.txt
      |
      --- dir/
           |
           --- file2.txt
and I would like to extract the contents of bar to an output directory using Python 3, getting something that looks like this:
output-dir/
 |
 --- file1.txt
 |
 --- dir/
      |
      --- file2.txt
However, when I run the code below, both bar and its contents are extracted to output-dir:
import zipfile

archive = zipfile.ZipFile('path/to/file.zip')
for archive_item in archive.namelist():
    if archive_item.startswith('bar/'):
        archive.extract(archive_item, 'path/to/output-dir')
How can I tackle this problem? Thanks!
Upvotes: 2
Views: 1325
Reputation: 42272
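ZipFile.extract always recreates the member's full archive path under the target directory, which is why bar itself shows up in the output. Instead of using ZipFile.extract, use ZipFile.open, open, and shutil.copyfileobj to put each file exactly where you want it, using path manipulation to build the output path you need.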
import os
import pathlib
import shutil
import zipfile

archive = zipfile.ZipFile('path/to/file.zip')
PREFIX = 'bar/'
out = pathlib.Path('path/to/output-dir')
for archive_item in archive.namelist():
    # skip directory entries such as 'bar/' or 'bar/dir/': they have no
    # data to copy, and calling `open` on a directory path would fail
    if archive_item.startswith(PREFIX) and not archive_item.endswith('/'):
        # strip out the leading prefix then join to `out`, note that you
        # may want to add some securing against path traversal if the zip
        # file comes from an untrusted source
        destpath = out.joinpath(archive_item[len(PREFIX):])
        # make sure destination directory exists otherwise `open` will fail
        os.makedirs(destpath.parent, exist_ok=True)
        with archive.open(archive_item) as source, open(destpath, 'wb') as dest:
            shutil.copyfileobj(source, dest)
something like that.
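If the archive might come from an untrusted source, the path traversal concern mentioned in the comments could be handled roughly like this. This is only a sketch: extract_subdir is a hypothetical helper name, and the check simply assumes that any member whose resolved destination falls outside the output directory should be rejected with a ValueError.

```python
import shutil
import zipfile
from pathlib import Path


def extract_subdir(zip_path, prefix, out_dir):
    """Copy every file stored under `prefix` into `out_dir`, minus the prefix.

    Hypothetical helper for illustration; returns the list of paths written.
    """
    out_root = Path(out_dir).resolve()
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            name = info.filename
            # Ignore members outside the prefix and pure directory entries.
            if not name.startswith(prefix) or info.is_dir():
                continue
            dest = (out_root / name[len(prefix):]).resolve()
            # Reject names such as 'bar/../../x' that would resolve to a
            # location outside the output directory.
            if out_root not in dest.parents:
                raise ValueError(f"unsafe member path: {name!r}")
            dest.parent.mkdir(parents=True, exist_ok=True)
            with archive.open(info) as source, open(dest, "wb") as target:
                shutil.copyfileobj(source, target)
            written.append(dest)
    return written
```

Calling extract_subdir('path/to/file.zip', 'bar/', 'path/to/output-dir') should then produce the layout from the question. Raising on a suspicious member is one choice; silently skipping it would work too, depending on how strict you want to be.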
Upvotes: 4