Reputation: 7489
I have 10,000 binary files, named like this:
file0.bin
file1.bin
...
file10000.bin
Each of the above files contains exactly 391 float values (1564 bytes per file).
My goal is to read all of the files into a Python array in the fastest way possible. If I open and close each file using a script, it takes a lot of time (about 8 minutes!). Are there any other creative ways to read these files FAST?
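(For illustration, the per-file loop looks roughly like this; it is a simplified sketch rather than my exact script:)

import struct

values = []
for i in range(10001):                       # file0.bin .. file10000.bin
    with open("file%d.bin" % i, "rb") as f:  # open & close every single file
        data = f.read(1564)                  # 391 floats * 4 bytes each
        values.extend(struct.unpack("391f", data))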
I am using Ubuntu Linux and would prefer a solution that can work with Python. Thanks.
Upvotes: 3
Views: 709
Reputation: 96071
You have 10001 files (0 to 10000 inclusive) and it takes 8 minutes to run the following?
try:
    xrange  # Python 2/3 compatibility: xrange only exists on Python 2
except NameError:
    xrange = range

import array

final = array.array('f')
for file_seq in xrange(10001):
    with open("file%d.bin" % file_seq, "rb") as fp:
        final.fromfile(fp, 391)
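As a quick sanity check (a minimal timing sketch, not tested against your data), you could wrap the loop with a timer to see how long the raw reads actually take:

import time
import array

start = time.time()
final = array.array('f')
for file_seq in range(10001):
    with open("file%d.bin" % file_seq, "rb") as fp:
        final.fromfile(fp, 391)
print("read %d floats in %.2f seconds" % (len(final), time.time() - start))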
What's the underlying filesystem? How much RAM do you have? What's your processor and its speed?
Upvotes: 0
Reputation: 49095
If you want it to go even faster, make a ramdisk:
# mkfs -q /dev/ram1 $(( 2 * 10000)) ## roughly the size you need
# mkdir -p /ramcache
# mount /dev/ram1 /ramcache
# df -H | grep ramcache
Now concatenate the files:
# cat file{0..10000}.bin >> /ramcache/concat.bin ## thanks SiegeX
Then run your script on that file.
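Reading the concatenated file back is then a single call (a rough sketch, untested, assuming the concatenation above produced /ramcache/concat.bin):

import array
import os

final = array.array('f')
path = "/ramcache/concat.bin"
count = os.path.getsize(path) // final.itemsize   # total number of 4-byte floats in the file
with open(path, "rb") as fp:
    final.fromfile(fp, count)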
Since I haven't tested this, I prefixed the shell commands with '#' so that you won't have any accidents. Just remove the '#' if you want them to work.
This is an option, but I would urge you to look at the comments people have posted directly under your question. You could probably get better results by examining what you are doing wrong, as I could not reproduce your speed problem of 8 minutes.
Upvotes: 2
Reputation: 34708
Iterate over them and use the optimise flag. You might also want to process them using PyPy; it compiles Python via a JIT compiler, allowing for a somewhat marked increase in speed.
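For example (a sketch; read_files.py is just a placeholder for whatever your script is called):

python -O read_files.py   # -O is the optimise flag: strips asserts and sets __debug__ to False
pypy read_files.py        # run the same script under PyPy's JIT (requires PyPy to be installed)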
Upvotes: 0