Reputation: 5412
I am trying to read a zipped file in Python. I want to read only the files with "debug" in their names and print only the lines that contain BROKER_LOGON. For some reason it does not read line by line; it prints the entire file that has BROKER_LOGON in it. Is there a way to read a zipped file line by line?
import os
import zipfile
import re

def main():
    try:
        root = zipfile.ZipFile("C:/Documents and Settings/Desktop/20110526-1708-server.zip", "r")
    except:
        root = "."
    for name in root.namelist():
        i = name.find("debug")
        if i>0:
            line = root.read(name).find("BROKER_LOGON")
            if line >0:
                print line

if __name__== "__main__":
    main()
Upvotes: 3
Views: 9321
Reputation: 12479
for name in root.namelist():
    if name.find("debug") >= 0:
        for line in root.read(name).split("\n"):
            if line.find("BROKER_LOGON") >= 0:
                print line
This code reads the whole decompressed file with root.read(name), splits it into lines, and then scans each line.
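For readers on Python 3, the same read-then-split idea might look like the sketch below. The function name `grep_zip_by_read` and its parameters are made up for illustration, and the UTF-8 default is an assumption about the log encoding. In Python 3, `read()` returns bytes, so the data is decoded before it is split.

```python
import zipfile

def grep_zip_by_read(zip_path, name_part, needle, encoding="utf-8"):
    """Return (member_name, line) pairs for lines that contain `needle`.

    Hypothetical helper: only members whose name contains `name_part`
    are scanned. The encoding is an assumption about the log files.
    """
    hits = []
    with zipfile.ZipFile(zip_path, "r") as root:
        for name in root.namelist():
            if name_part not in name:
                continue
            # read() returns the whole decompressed member as bytes
            text = root.read(name).decode(encoding, errors="replace")
            # splitlines() copes with both "\n" and "\r\n" line endings
            for line in text.splitlines():
                if needle in line:
                    hits.append((name, line))
    return hits
```

Something like `grep_zip_by_read("server.zip", "debug", "BROKER_LOGON")` would then return the matching lines, where "server.zip" is a placeholder path.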
Upvotes: 3
Reputation: 15726
Although ZipFile.read() returns the entire file, you can split it by newline characters, and then check it like so:
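file_data = root.read(name)
# splitlines() handles both "\n" and "\r\n" line endings
for line in file_data.splitlines():
    # find() returns 0 for a match at the start of the line, so test >= 0
    if line.find("BROKER_LOGON") >= 0:
        print line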
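It may be more memory-efficient to use StringIO, which iterates over the data without first building a list of every line (the file contents themselves are still read into memory):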
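from StringIO import StringIO
stream = StringIO(root.read(name))
for line in stream:
    # find() returns 0 for a match at the start of the line, so test >= 0
    if line.find("BROKER_LOGON") >= 0:
        # each line keeps its line ending, so strip it before printing
        print line.rstrip("\r\n")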
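In Python 3 the StringIO module was folded into io, and read() returns bytes rather than str. A rough port of the StringIO approach, assuming the logs are UTF-8 and using a hypothetical helper name `grep_member_via_stringio`, could look like this:

```python
import io
import zipfile

def grep_member_via_stringio(root, name, needle, encoding="utf-8"):
    """Scan one archive member line by line through an in-memory stream.

    `root` is an open zipfile.ZipFile. The helper name and the encoding
    default are assumptions made for this sketch.
    """
    data = root.read(name)  # whole decompressed member, as bytes
    stream = io.StringIO(data.decode(encoding, errors="replace"))
    hits = []
    for line in stream:  # yields one line at a time, ending included
        if needle in line:
            hits.append(line.rstrip("\r\n"))
    return hits
```

The member is still fully decompressed into memory here; only the list of split lines is avoided.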
Upvotes: 0
Reputation: 121
You can open() a file directly from within the ZipFile object, which gives you a file-like object to read from.
Try something like this:
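# Let a bad path or a corrupt archive raise here; a fallback such as
# root = "." would only fail later, because a string has no namelist().
root = zipfile.ZipFile("C:/Documents and Settings/Desktop/20110526-1708-server.zip", "r")
for name in root.namelist():
    # find() returns -1 for no match and 0 for a match at the start
    if name.find("debug") >= 0:
        lines = root.open(name).readlines()
        for line in lines:
            if line.find("BROKER_LOGON") >= 0:
                # each line keeps its line ending, so strip it before printing
                print line.rstrip("\r\n")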
You can do anything you want with the list of lines returned from readlines().
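If the debug files are large, the object returned by open() can also be iterated directly instead of calling readlines(), so that lines are decompressed as they are consumed. Below is a minimal Python 3 sketch of that variation; `iter_matching_lines` is a hypothetical name, and the UTF-8 default is an assumption about the log encoding.

```python
import io
import zipfile

def iter_matching_lines(zip_path, name_part, needle, encoding="utf-8"):
    """Yield (member_name, line) for each line that contains `needle`.

    Hypothetical helper: `zip_path` may be a path or a file-like object,
    and only members whose name contains `name_part` are scanned.
    """
    with zipfile.ZipFile(zip_path, "r") as root:
        for name in root.namelist():
            if name_part not in name:
                continue
            with root.open(name) as raw:
                # Wrap the binary stream so iteration yields decoded text lines
                text = io.TextIOWrapper(raw, encoding=encoding, errors="replace")
                for line in text:
                    if needle in line:
                        yield name, line.rstrip("\r\n")
```

A loop such as `for name, line in iter_matching_lines("server.zip", "debug", "BROKER_LOGON"): print(line)` would print matches one at a time, where "server.zip" is a placeholder path.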
Upvotes: 4
Reputation: 894
You need to unzip the file first, then read it line by line. If you don't unzip it, you will be reading the compressed character data (garbage).
Upvotes: 1