Gyan Prakash Mishra

Reputation: 35

Open a large gzip file (~1 GB) in Python

I am a beginner in Python. I have written a few lines of code to open a large gzip file (~1 GB) and extract some content, but I am getting a memory-related error. Could somebody please guide me on how to open the gzip file with limited memory? Below is the part of the code that throws the error.

import os
import gzip

with gzip.open("test.gz","rb") as peak:
     for line in peak:
         file_content = line.read().decode("utf-8")             
         print(file_content)

Error: File "/software/anaconda3/lib/python3.7/gzip.py", line 276, in read return self._buffer.read(size)

Upvotes: 2

Views: 415

Answers (1)

Andrew F

Reputation: 2950

I tried to reproduce your issue but could not. I created a large file with fallocate, gzipped it, and read it in Python without hitting any error:

$ fallocate -l 2G tempfile.img
$ gzip tempfile.img
$ ipython
>>> import gzip
>>> with gzip.open('tempfile.img.gz', 'rb') as fIn:
...     content = fIn.read()

If you hit an exception, it should have a name such as OSError or something more specific. My guess is that you have a 32-bit installation of Python, which would impose memory limits in the gigabyte range. This SO thread covers a way to check whether you're running 32- or 64-bit.
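For reference, one common way to check the interpreter's bitness from Python itself is sketched below. The helper names `python_bitness` and `is_64bit` are made up for illustration; the two checks they wrap (`struct.calcsize("P")` and `sys.maxsize`) are standard library features.

```python
import struct
import sys


def python_bitness() -> int:
    """Return the pointer width of the running interpreter in bits."""
    # "P" is the struct format code for a native pointer (void *).
    return struct.calcsize("P") * 8


def is_64bit() -> bool:
    """Return True when the interpreter is a 64-bit build."""
    # sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
    return sys.maxsize > 2**32


print(f"{python_bitness()}-bit Python")
```

If this reports 32-bit, a single process can typically address only a few gigabytes, so reading a large decompressed file in one call may well fail.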

If you post the name of the exception or a reproducible example, then I can update this answer.
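In the meantime, if the goal is simply to keep memory use low, a minimal sketch of reading the file one line at a time might look like the following. It assumes the file is UTF-8 text with newline-separated records; the function names `iter_gzip_lines` and `count_matching_lines` are hypothetical helpers, not part of the gzip module.

```python
import gzip


def iter_gzip_lines(path, encoding="utf-8"):
    """Yield decoded lines one at a time instead of loading the whole file."""
    # Mode "rt" opens the file as text, so each line arrives already decoded.
    with gzip.open(path, "rt", encoding=encoding) as handle:
        for line in handle:
            yield line.rstrip("\n")


def count_matching_lines(path, needle):
    """Count the lines containing `needle` while holding one line in memory."""
    return sum(1 for line in iter_gzip_lines(path) if needle in line)


# Example usage (assumes a file named test.gz exists):
# for line in iter_gzip_lines("test.gz"):
#     print(line)
```

Note that each item produced by iterating over the file object is already a line, so there is no `.read()` to call on it. If the data contains no newlines (for example, a binary image), iterating by line would still pull everything into a single "line"; in that case, reading fixed-size chunks with `handle.read(n)` in a loop is the usual alternative.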

Upvotes: 1
