Soumya

Reputation: 5412

Reading line by line from a zipped file in python

I am trying to read a zipped file in Python. I want to read only the files with "debug" in their names and print only the lines that contain BROKER_LOGON. However, my code does not read line by line; it prints the entire file that has BROKER_LOGON in it. Is there a way to read line by line from a zipped file?

import os
import zipfile
import re

def main():
    try:
        root = zipfile.ZipFile("C:/Documents and Settings/Desktop/20110526-1708-server.zip", "r")
    except:
        root = "."
    for name in root.namelist():
        i = name.find("debug")
        if i > 0:
            line = root.read(name).find("BROKER_LOGON")
            if line > 0:
                print line


if __name__ == "__main__":
    main()

Upvotes: 3

Views: 9321

Answers (4)

Antti

Reputation: 12479

for name in root.namelist():
    if name.find("debug") >= 0:
        for line in root.read(name).split("\n"):
            if line.find("BROKER_LOGON") >= 0:
                print line

This code reads the raw file contents with root.read(name), splits them into lines, and then scans each line for BROKER_LOGON.
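For anyone on Python 3, the same read-then-split idea might look roughly like the sketch below. The function name matching_lines, the UTF-8 encoding, and the small in-memory archive are assumptions made for illustration; they are not part of the original question.

```python
import io
import zipfile


def matching_lines(archive, name_part, needle, encoding="utf-8"):
    """Return lines containing `needle` from members whose name contains `name_part`."""
    hits = []
    for name in archive.namelist():
        if name_part not in name:
            continue
        # read() decompresses the whole member and returns bytes
        text = archive.read(name).decode(encoding, errors="replace")
        # splitlines() handles "\n", "\r\n" and "\r" line endings alike
        for line in text.splitlines():
            if needle in line:
                hits.append(line)
    return hits


# Hypothetical in-memory archive so the sketch runs without the original zip
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("debug.log", "start\r\nBROKER_LOGON user=a\r\nstop\r\n")
    archive.writestr("info.log", "BROKER_LOGON user=b\n")

with zipfile.ZipFile(buffer) as archive:
    for line in matching_lines(archive, "debug", "BROKER_LOGON"):
        print(line)
```

Note that read() still loads each matching member fully into memory, so this suits moderately sized log files.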

Upvotes: 3

Robert

Reputation: 15726

Although ZipFile.read() returns the entire file, you can split it by newline characters, and then check it like so:

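file_data = root.read(name)
# splitlines() handles both "\n" and "\r\n" line endings
for line in file_data.splitlines():
    # find() returns 0 for a match at the start of the line, so compare with >= 0
    if line.find("BROKER_LOGON") >= 0:
        print line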

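Alternatively, you can wrap the contents in StringIO and iterate over it, which avoids building a separate list of all lines (the file contents themselves are still held in memory):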

from StringIO import StringIO

stream = StringIO(root.read(name))
for line in stream:
    # Each line keeps its trailing newline, so strip it before printing
    if line.find("BROKER_LOGON") >= 0:
        print line.rstrip("\r\n")
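In Python 3 the StringIO module no longer exists; io.StringIO is the closest equivalent and needs text rather than bytes. A minimal sketch, assuming UTF-8 logs (the helper name scan_with_stringio and the sample archive are made up for illustration):

```python
import io
import zipfile


def scan_with_stringio(archive, name, needle, encoding="utf-8"):
    """Yield lines of one member that contain `needle`, via an in-memory text stream."""
    # The whole member is still decompressed into memory here;
    # StringIO only avoids building a separate list of all lines.
    stream = io.StringIO(archive.read(name).decode(encoding, errors="replace"))
    for line in stream:
        if needle in line:
            # Lines keep their line ending, so strip it before handing them back
            yield line.rstrip("\r\n")


# Hypothetical in-memory archive standing in for the real zip file
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("debug.log", "BROKER_LOGON user=a\r\nunrelated\r\n")

with zipfile.ZipFile(buffer) as archive:
    for line in scan_with_stringio(archive, "debug.log", "BROKER_LOGON"):
        print(line)
```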

Upvotes: 0

lwg643

Reputation: 121

You can open() a file directly within a ZipFile.

Try something like this:

root = zipfile.ZipFile("C:/Documents and Settings/Desktop/20110526-1708-server.zip", "r")
for name in root.namelist():
    # find() returns -1 when there is no match and 0 for a match at the start
    if name.find("debug") >= 0:
        # open() returns a file-like object that decompresses the member
        lines = root.open(name).readlines()
        for line in lines:
            if line.find("BROKER_LOGON") >= 0:
                print line.rstrip("\r\n")

You can do anything you want with the list of lines returned from readlines().
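If the files are large, it may be worth iterating over the object returned by open() instead of calling readlines(), so that only one line is held at a time. A rough Python 3 sketch follows; stream_matches, the UTF-8 encoding, and the sample archive are assumptions for illustration:

```python
import io
import zipfile


def stream_matches(archive, name_part, needle, encoding="utf-8"):
    """Lazily yield (member name, line) pairs for lines containing `needle`."""
    for name in archive.namelist():
        if name_part not in name:
            continue
        # open() returns a binary file-like object that decompresses as it is read
        with archive.open(name) as raw:
            # TextIOWrapper decodes bytes to str and translates "\r\n" to "\n"
            for line in io.TextIOWrapper(raw, encoding=encoding, errors="replace"):
                if needle in line:
                    yield name, line.rstrip("\n")


# Hypothetical in-memory archive standing in for the real zip file
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("logs/debug-1.txt", "BROKER_LOGON user=a\r\nunrelated\r\n")
    archive.writestr("logs/info.txt", "BROKER_LOGON user=b\n")

with zipfile.ZipFile(buffer) as archive:
    for member, line in stream_matches(archive, "debug", "BROKER_LOGON"):
        print(member, line)
```

Because the function is a generator, the archive has to stay open while the results are being consumed.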

Upvotes: 4

chisaipete

Reputation: 894

You need to decompress the file first, then read it line by line. If you read the archive's raw bytes without decompressing them, you will be reading the compressed data (garbage).
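One way to unzip first is to extract the member to a temporary directory and then open it like any other file. The Python 3 sketch below assumes UTF-8 logs; extract_and_scan and the sample archive are hypothetical names used only for illustration:

```python
import io
import tempfile
import zipfile


def extract_and_scan(archive, name, needle, encoding="utf-8"):
    """Extract one member to a temporary directory, then read it as a normal file."""
    hits = []
    with tempfile.TemporaryDirectory() as tmp_dir:
        # extract() writes the decompressed member and returns the path it created
        path = archive.extract(name, path=tmp_dir)
        with open(path, encoding=encoding, errors="replace") as handle:
            for line in handle:
                if needle in line:
                    hits.append(line.rstrip("\n"))
    return hits


# Hypothetical in-memory archive standing in for the real zip file
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("debug.log", "BROKER_LOGON user=a\r\nunrelated\r\n")

with zipfile.ZipFile(buffer) as archive:
    print(extract_and_scan(archive, "debug.log", "BROKER_LOGON"))
```

This costs some disk I/O, so it is mainly useful when other tools also need the extracted file.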

Upvotes: 1
