appills

Reputation: 302

When are files too large to be read as strings in Python?

I understand (in some minimal sense) that it is considered bad practice to read an entire text file as a string if you don't know the file size, or the file size is big. For example:

with open('letters.txt', 'r') as my_txt_file:
    my_txt = my_txt_file.read()

would make my_txt a string that consists of all the text in 'letters.txt'.

I'm assuming that the threshold for considering a file too large to read as a string depends on the specifications of one's hardware. But, in general, is there a file size above which one should opt for reading a file line by line?

Upvotes: 1

Views: 1944

Answers (1)

Chris_Rands

Reputation: 41168

The theoretical limit is the maximum size of a Python string, which is determined by its index type: sys.maxsize, which is 2 ** 63 - 1 on a 64-bit build, as explained here.
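As a quick check of that bound on your own interpreter, here is a minimal sketch. The variable names `max_len` and `is_64bit` are only illustrative; `sys.maxsize` is the standard attribute holding the largest value a `Py_ssize_t` can take.

```python
import sys

# Largest index CPython can represent, and so an upper bound on the
# length of any str, bytes or list on this build.
max_len = sys.maxsize

# True on a typical 64-bit build; a 32-bit build gives 2 ** 31 - 1 instead.
is_64bit = max_len == 2 ** 63 - 1

print(max_len, is_64bit)
```

In practice you will run out of memory long before reaching this length.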

The practical limit depends on the memory of your system. If holding the string in memory demands more memory than your system has available, you will get a MemoryError.
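If you want a guard against that, one option is to check the size on disk before calling read(). This is only a sketch: the function name `read_whole_if_small` and the 100 MiB cutoff are hypothetical choices for illustration, not a recommended threshold, and the decoded string's memory footprint may differ from the file's size on disk.

```python
import os

# Hypothetical cutoff chosen for illustration only; tune it to your machine.
MAX_WHOLE_READ_BYTES = 100 * 1024 * 1024  # 100 MiB


def read_whole_if_small(path, limit=MAX_WHOLE_READ_BYTES):
    """Return the file's text if it is at most `limit` bytes, else None."""
    if os.path.getsize(path) > limit:
        return None  # caller should fall back to reading incrementally
    with open(path, 'r') as f:
        return f.read()
```

Calling `read_whole_if_small('letters.txt')` would return the text for a small file and None for one above the cutoff.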

Good practice is somewhat more subjective. However, in general I consider reading the file line by line to be good practice even for small files, since it keeps memory use low. Of course there may be some situations where you need the whole file in one string, but I think these are very rare.
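For reference, reading line by line just means iterating over the file object, which yields one line at a time instead of building one big string. A minimal sketch, where the function `count_lines_containing` is a made-up example task:

```python
def count_lines_containing(path, needle):
    """Count the lines that contain `needle`, one line in memory at a time."""
    count = 0
    with open(path, 'r') as f:
        for line in f:  # file objects iterate lazily, line by line
            if needle in line:
                count += 1
    return count
```

Note that this only saves memory if the file actually contains line breaks; a file with no newlines is still read as one long line.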

Upvotes: 2
