RTC222

Reputation: 2323

Fastest way to read a large binary file with Python

I need to read a simple but large (500MB) binary file in Python 3.6. The file was created by a C program and contains 64-bit double-precision data. I tried using struct.unpack, but it's very slow for a file this large.

Here is my simple file read:

def ReadBinary():
    fileName = 'C:\\File_Data\\LargeDataFile.bin'

    with open(fileName, mode='rb') as file:
        fileContent = file.read()
    return fileContent

Now I have fileContent. What is the fastest way to decode it into 64-bit double-precision floating point, or read it without the need to do a format conversion?

I want to avoid, if possible, reading the file in chunks. I would like to read it decoded, all at once, like C does.

Upvotes: 2

Views: 911

Answers (1)

ShadowRanger

Reputation: 155323

You can use the fromfile method of array.array('d'):

import array
import os

def ReadBinary():
    fileName = r'C:\File_Data\LargeDataFile.bin'

    fileContent = array.array('d')
    with open(fileName, mode='rb') as file:
        # fromfile requires an item count; derive it from the file size
        fileContent.fromfile(file, os.path.getsize(fileName) // fileContent.itemsize)
    return fileContent
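If you already have the raw bytes in memory, as in the question's file.read(), a sketch along these lines (the helper name is my own) converts them without a second disk read, using array.frombytes:

```python
import array

def decode_doubles(raw_bytes):
    # Reinterpret an in-memory bytes object as native 64-bit doubles.
    # The length of raw_bytes must be a multiple of 8.
    values = array.array('d')
    values.frombytes(raw_bytes)
    return values
```

Like fromfile, this appends machine values directly with no per-element struct decoding, which is where the struct.unpack approach loses its time.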

That's a C-level read of raw machine values, with no per-element conversion. mmap.mmap could also work: create a memoryview of the mmap object and cast it to 'd'.
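A sketch of that mmap approach (the helper name is my own; it assumes the file length is an exact multiple of 8 bytes, or the cast raises ValueError):

```python
import mmap

def read_doubles_mapped(file_name):
    # Map the whole file read-only, then reinterpret its bytes as
    # native 64-bit doubles without an intermediate copy.
    with open(file_name, mode='rb') as file:
        with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            view = memoryview(mapped).cast('d')
            try:
                # tolist() copies the values out so the map can close;
                # keep the view itself if you can leave the mmap open.
                return view.tolist()
            finally:
                view.release()
```

The view must be released before the mmap closes, hence the try/finally; holding an exported buffer past mapped's close would raise BufferError.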

Upvotes: 6
