Reputation: 23
This is kind of a question, but it's also kind of me just hoping I don't have to write a bunch of code to get behavior I want. (Plus, if it already exists, it probably runs faster than what I would write anyway.) I have a number of large lists of numbers that cannot fit into memory -- at least not all at the same time. That is fine, because I only need a small portion of each list at a time, and I know how to save the lists into files and read out the part of the list I need. The problem is that my method of doing this is somewhat inefficient, as it involves iterating through the file to reach the part I want. So, I was wondering if there happened to be some library out there that I'm not finding that allows me to index a file as though it were a list, using the [] notation I'm familiar with. Since I'm writing the files myself, I can make their formatting whatever I need, but currently my files contain nothing but the elements of the list, with \n as a delimiter between values.
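For reference, the inefficient read I describe looks roughly like this (a minimal sketch; read_range and numbers.txt are just placeholder names):

from itertools import islice

def read_range(path, start, stop):
    # Scans the file line by line; islice still has to consume the
    # first `start` lines, which is what makes this O(n) in the index.
    with open(path) as f:
        return [int(line) for line in islice(f, start, stop)]

chunk = read_range('numbers.txt', 1000, 1003)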
Just to recap what I'm looking for, and to make it more specific:

1) f[1:3] should return the corresponding elements of the list as a Python list object in memory
2) f[i] = x should write the value x to the file f in the location corresponding to index i

To be honest, I don't expect this to exist, but you never know when you've missed something in your research. So, I figured I'd ask. On a side note, if this doesn't exist, is it possible to overload the [] operator in Python?
Upvotes: 2
Views: 347
Reputation: 23186
If your data is purely numeric, you could consider using numpy arrays and storing the data in npy format. Once stored in this format, you could load the memory-mapped file as:
>>> X = np.load("some-file.npy", mmap_mode="r")
>>> X[1000:1003]
memmap([4, 5, 6])
This access will read directly from disk without loading the data that precedes the requested slice.
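To create such a file in the first place, and to write through the same mapping, something along these lines should work (np.save and the "r+" mmap mode are standard numpy; the file name and values are placeholders):

import numpy as np

# Write the whole list out once in npy format.
np.save('some-file.npy', np.arange(1_000_000))

# Reopen memory-mapped in read/write mode; slice assignments go to disk.
X = np.load('some-file.npy', mmap_mode='r+')
X[1000:1003] = [7, 8, 9]
X.flush()  # ensure the changes are written back to the file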
Upvotes: 1
Reputation: 19885
You can actually do this by writing a simple class, I think:
class FileWrapper:
    def __init__(self, path, **kwargs):
        # Open in binary mode: offsets are then plain byte offsets, and
        # text-mode files do not allow end-relative seeks anyway.
        self._file = open(path, 'r+b', **kwargs)

    def _do_single(self, where, s=None):
        if where >= 0:
            self._seek(where)
        else:
            # Negative index: seek relative to the end of the file.
            self._seek(where, 2)
        if s is None:
            return self._read(1)
        else:
            return self._write(s)

    def _do_slice_contiguous(self, start, end, s=None):
        if start is None:
            start = 0
        self._seek(start)
        if s is None:
            # A size of -1 means "read to the end of the file".
            return self._read(-1 if end is None else end - start)
        else:
            return self._write(s)

    def _do_slice(self, where, s=None):
        if s is None:
            result = []
            for index in where:
                self._seek(index)
                result.append(self._read(1))
            return result
        else:
            # Iterating over a bytes object yields ints, so re-wrap each one.
            for index, char in zip(where, s):
                self._seek(index)
                self._write(bytes([char]))
            return len(s)

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._do_single(key)
        elif isinstance(key, slice):
            if self._is_contiguous(key):
                return self._do_slice_contiguous(key.start, key.stop)
            else:
                return self._do_slice(self._process_slice(key))
        else:
            raise ValueError('File indices must be ints or slices.')

    def __setitem__(self, key, value):
        if isinstance(key, int):
            return self._do_single(key, value)
        elif isinstance(key, slice):
            if self._is_contiguous(key):
                return self._do_slice_contiguous(key.start, key.stop, value)
            else:
                where = self._process_slice(key)
                if len(where) == len(value):
                    return self._do_slice(where, value)
                else:
                    raise ValueError('Length of slice not equal to length of string to be written.')
        else:
            raise ValueError('File indices must be ints or slices.')

    def __del__(self):
        self._file.close()

    def _is_contiguous(self, key):
        return key.step is None or key.step == 1

    def _process_slice(self, key):
        # Fill in defaults; a non-contiguous slice still needs a stop value.
        start = 0 if key.start is None else key.start
        step = 1 if key.step is None else key.step
        return range(start, key.stop, step)

    def _read(self, size=-1):
        return self._file.read(size)

    def _seek(self, offset, whence=0):
        return self._file.seek(offset, whence)

    def _write(self, s):
        return self._file.write(s)
I'm sure many optimisations could be made, since I rushed through this, but it was fun to write.
This does not answer the question in full, because it supports random access of characters, as opposed to lines, which sit at a higher level of abstraction and are more complicated to handle (since they can be variable length).
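For completeness, a quick usage sketch (assuming a file data.txt already exists; since the file is opened in binary mode, indices are byte offsets and written values must be bytes):

f = FileWrapper('data.txt')
print(f[0])       # first byte, e.g. b'h'
print(f[0:5])     # first five bytes as a bytes object
print(f[0:10:2])  # every other byte, as a list of 1-byte values
f[0] = b'H'       # overwrite the first byte in place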
Upvotes: 1