vgoklani

Reputation: 11786

python gzip "unexpected end of file" when writing a data stream

I'm writing (or more precisely, appending) a real-time data stream to a gzip file using Python's gzip module. If the program that writes the stream crashes and relaunches, I would like the stream to be appended automatically to the original file. Unfortunately this fails in practice: when reading the file back I get an "unexpected end of file" error at the exact point where the original program crashed.

What's the usual approach for handling this situation? I can't imagine this is a hard problem. My approach is outlined below:

f = gzip.GzipFile('filename_json.txt.gz', mode='at', compresslevel=9)
while(something_is_true):
    f.write(stream['message'] + '\n')
f.close()

This runs continuously, but if the program crashes (or gets killed), the end-of-file marker never gets written and the gzip file becomes corrupt. Any data appended after that point is then unreadable.

Thanks!

Upvotes: 0

Views: 3438

Answers (1)

U2EF1

Reputation: 13289

with gzip.open('filename_json.txt.gz', mode='at', compresslevel=9) as f:
    while something_is_true:
        f.write(stream['message'] + '\n')

(This works for me on Python 2.7.6.)

But if that isn't working for some reason, you can do it the old-fashioned way:

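# Open before the try block, so that f is always defined when finally runs.
f = gzip.open('filename_json.txt.gz', mode='at', compresslevel=9)
try:
    while something_is_true:
        f.write(stream['message'] + '\n')
finally:
    # Runs even if the loop raises, so the file is closed properly.
    f.close()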

Note that the exception will still propagate with this code unless you catch it, but the file will be closed on the way out, so the gzip stream is finished properly and data appended later stays readable.
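
To make that concrete, here is a minimal Python 3 sketch of the behaviour described above. It is only an illustration, not a drop-in fix: the temporary path, the write_messages helper and the simulated RuntimeError are all made up for the example. The first call fails partway through, the with block still closes the file, and a second call appends to the same file, which then reads back cleanly.

```python
import gzip
import os
import tempfile

# Hypothetical location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'filename_json.txt.gz')


def write_messages(messages, fail_after=None):
    """Append messages to the gzip file; optionally raise partway through."""
    with gzip.open(path, mode='at', compresslevel=9) as f:
        for i, message in enumerate(messages):
            if fail_after is not None and i == fail_after:
                # Stand-in for whatever error crashes the real program.
                raise RuntimeError('simulated crash')
            f.write(message + '\n')


# First run: fails before the third message, but the file is still closed.
try:
    write_messages(['a', 'b', 'c'], fail_after=2)
except RuntimeError:
    pass

# "Relaunch": append more messages to the same file.
write_messages(['d', 'e'])

# Read everything back; both runs should be readable.
with gzip.open(path, mode='rt') as f:
    lines = f.read().splitlines()

print(lines)
```

This only covers failures that surface as a Python exception. If the process is killed outright (for example with SIGKILL), neither the with block nor the finally clause gets a chance to run, so the file can still end up truncated.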

Upvotes: 1
