Cmag

Reputation: 15770

Python gzip: omit the original filename and timestamp

Folks, I am generating an md5sum of a gzip file. It's compressing the same file each time, but the resulting md5sum is different. How do I get Python's gzip module to do what the -n flag of the gzip command does, i.e. omit the original filename and timestamp?

import gzip

# tmpFile and uploadFile are paths defined elsewhere
with open(tmpFile, 'rb') as f_in, gzip.open(uploadFile, 'wb') as f_out:
    f_out.writelines(f_in)

Thanks!

Upvotes: 3

Views: 1957

Answers (2)

JustAC0der

Reputation: 3149

If you would like to write utf-8 text to a gz file without a filename in the header, here's a way to do this:

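import gzip
import io

# Open the raw output file ourselves; GzipFile does not close a fileobj it was handed.
ofile = open("./stuff.txt.gz", 'wb')
# filename='' keeps the name out of the header; mtime=0 keeps the timestamp out.
ogzfile = gzip.GzipFile(filename='', mode='w', compresslevel=9, fileobj=ofile, mtime=0)
ogztextfile = io.TextIOWrapper(ogzfile, 'utf-8')

ogztextfile.write("Зарегистрируйтесь сейчас на\nДесятую Международную\nКонференцию")

# Closing the text wrapper flushes and closes the GzipFile; close the raw file last.
ogztextfile.close()
ogzfile.close()
ofile.close()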

Upvotes: 0

Mark Adler

Reputation: 112404

The GzipFile class allows you to explicitly provide the filename and the timestamp for the header.

E.g.:

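#!/usr/bin/python
import gzip

f = open('out.gz', 'wb')
# An empty filename and mtime=0 keep both fields out of the gzip header.
gz = gzip.GzipFile(filename='', mode='wb', compresslevel=9, fileobj=f, mtime=0)
gz.write(b'this is a test')
gz.close()
# GzipFile.close() does not close the underlying file object, so close it explicitly.
f.close()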

This will produce a gzip header with no filename and a modification time of zero, meaning no modification time per the RFC 1952 standard for gzip.
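Applied to the file-to-file copy in the question, the same idea might look like the sketch below. The helper names `gzip_deterministic` and `md5_of` are made up for illustration and are not part of the gzip module. The sketch assumes the input contents and the compression level stay the same between runs.

```python
import gzip
import hashlib
import shutil


def gzip_deterministic(src_path, dst_path):
    """Gzip src_path into dst_path with no filename and a zero mtime in the header.

    Hypothetical helper for illustration only.
    """
    with open(src_path, 'rb') as f_in, open(dst_path, 'wb') as raw_out:
        # filename='' suppresses the FNAME header field; mtime=0 zeroes the timestamp.
        with gzip.GzipFile(filename='', mode='wb', fileobj=raw_out, mtime=0) as gz_out:
            shutil.copyfileobj(f_in, gz_out)


def md5_of(path):
    """Return the hex MD5 digest of a file's bytes (hypothetical helper)."""
    with open(path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()
```

Calling `gzip_deterministic` twice on the same input, even with different output names, should then give files with the same `md5_of` value. If the data is already in memory, `gzip.compress(data, mtime=0)` (Python 3.8 and later) should give a similarly reproducible result without a file object.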

Upvotes: 5
