Reputation: 1141
I have a compressed data file (several files in a folder, which was then zipped). I want to read each file without unzipping the archive. I tried several methods, but none of them let me get at the files inside the folder in the zip file. How can I achieve that?
Without a folder in the zip file:
    import zipfile

    with zipfile.ZipFile('data.zip') as z:
        for filename in z.namelist():
            with z.open(filename) as f:
                data = f.readlines()
With one folder:
    with zipfile.ZipFile('data.zip') as z:
        for filename in z.namelist():
            if filename.endswith('/'):
                pass  # This is where I got stuck
Upvotes: 23
Views: 48432
Reputation: 485
I got grofte's code to work. I made some minor additions: when dealing with command-line input, it's important to handle exceptions. Plus some more print statements to help make clear what's going on.
    import sys
    import zipfile

    archive = sys.argv[1]  # assuming launched with `python my_script.py archive.zip`

    try:
        with zipfile.ZipFile(archive) as z:
            for filename in z.namelist():
                # directory entries in a zip end with '/'; os.path.isdir()
                # would look at the local filesystem, not inside the archive
                if not filename.endswith('/'):
                    print(f'\nFile "{filename}":')
                    # read the file
                    for line in z.open(filename):
                        print(line.decode('utf-8'))
                else:
                    print(f'\nDirectory "{filename}"')
    except zipfile.BadZipFile:
        print(f'Bad zip file: "{archive}"')
    except IsADirectoryError:
        print(f'Directory, not file: "{archive}"')
    except FileNotFoundError:
        print(f'File not found: "{archive}"')
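If you would rather check the command-line argument up front than catch exceptions afterwards, something along these lines might work. It is only a sketch: describe_archive is a made-up helper, the file names are invented for the demo, and zipfile.is_zipfile() is not a full integrity check, so a damaged archive can still fail later when a member is read.

```python
import os
import tempfile
import zipfile


def describe_archive(path):
    """Classify `path` before trying to open it as a zip archive."""
    if not os.path.exists(path):
        return 'File not found'
    if os.path.isdir(path):
        # here os.path.isdir() is fine: `path` is on the local filesystem
        return 'Directory, not file'
    if not zipfile.is_zipfile(path):
        return 'Bad zip file'
    return 'OK'


# Demo with throwaway files in a temporary directory (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, 'good.zip')
    with zipfile.ZipFile(good, 'w') as z:
        z.writestr('folder/a.txt', 'text')

    bad = os.path.join(tmp, 'bad.zip')
    with open(bad, 'w') as f:
        f.write('this is not a zip archive')

    demo_results = [
        describe_archive(good),
        describe_archive(bad),
        describe_archive(tmp),
        describe_archive(os.path.join(tmp, 'missing.zip')),
    ]

print(demo_results)
```

The messages mirror the three except branches above, so the two approaches can be mixed if that suits your script better.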
Upvotes: 1
Reputation: 2119
I got RichS' code to work. I made some minor edits:
    import sys
    import zipfile

    archive = sys.argv[1]  # assuming launched with `python my_script.py archive.zip`

    with zipfile.ZipFile(archive) as z:
        for filename in z.namelist():
            # directory entries in a zip end with '/'; os.path.isdir()
            # would look at the local filesystem, not inside the archive
            if not filename.endswith('/'):
                # read the file
                for line in z.open(filename):
                    print(line.decode('utf-8'))
As you can see the edits are minor. I've switched to Python 3, the ZipFile class has a capital F, and the output is converted from b-strings to unicode strings. Only decode if you are trying to unzip a text file.
PS I'm not dissing RichS at all. I just thought it would be hilarious. Both useful and a mild shitpost.
PPS You can get a file from a password-protected archive by passing pwd (as bytes) to ZipFile.open(name, mode='r', pwd=None, *, force_zip64=False) or ZipFile.read(name, pwd=None). If you use .read then there's no context manager, so you would simply do
    # read the file
    print(z.read(filename).decode('utf-8'))
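For reference, pwd has to be a bytes object, not a str. Below is a rough sketch of a helper that accepts either; the name read_member and the demo file names are invented for illustration. The standard zipfile module can decrypt but cannot create encrypted archives, so the demo uses an unencrypted archive, for which pwd is simply ignored.

```python
import os
import tempfile
import zipfile


def read_member(archive_path, member, password=None, encoding='utf-8'):
    """Return one archive member as text; `password` may be str, bytes or None."""
    if isinstance(password, str):
        # zipfile expects the password as bytes
        password = password.encode('utf-8')
    with zipfile.ZipFile(archive_path) as z:
        return z.read(member, pwd=password).decode(encoding)


# Demo on a throwaway, unencrypted archive (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'demo.zip')
    with zipfile.ZipFile(demo_path, 'w') as z:
        z.writestr('folder/notes.txt', 'hello\n')
    demo_text = read_member(demo_path, 'folder/notes.txt', password='secret')

print(demo_text)
```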
Upvotes: 3
Reputation: 473873
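namelist() returns a list of all items in the archive, including everything inside folders, so there is no need to recurse.
Directory entries have names ending in '/', so you can skip them by checking the name (os.path.isdir() would check the local filesystem rather than the archive):
    import zipfile

    with zipfile.ZipFile('archive.zip') as z:
        for filename in z.namelist():
            if not filename.endswith('/'):
                # read the file
                with z.open(filename) as f:
                    for line in f:
                        print line  # Python 2 syntax; use print(line) on Python 3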
Hope that helps.
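On Python 3.6 or later, a variant of this loop can ask the archive itself which entries are directories via ZipInfo.is_dir(), and decode text while reading. The sketch below is only illustrative: it builds a small archive in memory so that it runs on its own, and the folder and file names as well as the helper name read_text_members are made up for the example.

```python
import io
import zipfile


def read_text_members(zip_source, encoding='utf-8'):
    """Return {member name: list of text lines} for every file in the archive.

    `zip_source` may be a path or a binary file-like object.
    """
    contents = {}
    with zipfile.ZipFile(zip_source) as z:
        for info in z.infolist():
            if info.is_dir():
                # directory entries (names ending in '/') carry no data
                continue
            with z.open(info) as raw:
                # z.open() yields bytes; wrap it to decode line by line
                text = io.TextIOWrapper(raw, encoding=encoding)
                contents[info.filename] = [line.rstrip('\n') for line in text]
    return contents


# Build a hypothetical archive in memory: one folder holding two text files.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as z:
    z.writestr('data/', '')  # explicit directory entry
    z.writestr('data/a.txt', 'first\nsecond\n')
    z.writestr('data/b.txt', 'third\n')
buffer.seek(0)

demo_contents = read_text_members(buffer)
for name, lines in demo_contents.items():
    print(name, lines)
```

Files inside the folder simply show up with the folder name as a prefix (here data/a.txt), so nothing special is needed to "enter" the folder.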
Upvotes: 41
Reputation: 943
I got Alec's code to work. I made some minor edits: (note, this won't work with password-protected zipfiles)
    import sys
    import zipfile

    z = zipfile.ZipFile(sys.argv[1])  # Flexibility with regard to zipfile
    for filename in z.namelist():
        # directory entries end with '/'
        if not filename.endswith('/'):
            # read the file
            for line in z.open(filename):
                print line  # Python 2 syntax
    z.close()  # Close the file after opening it
    del z  # Cleanup (in case there's further work after this)
Upvotes: 7