Martin

Reputation: 3385

numpy array - load from disk row by row - memory efficient but fast

Is there a way to pipeline numpy arrays from disk that were saved like this

np.save('data.npy', np.zeros(shape=[500, 300, 3]))  # RGB image

and read them row by row (or column by column), similar to how a generator works, but without the loading latency?


Detailed description

My application needs near-zero latency, but loading larger arrays from disk can take some time (~0.02-0.1 s). Even this small latency produces unpleasant results.

I have a solution that satisfies the speed requirement:

dictionary = {'array1': array1, ....}

With this I can access the arrays immediately, but since I am running on a Raspberry Pi Zero, my Python program is limited in CPU and RAM, so with many arrays I would quickly run into

MemoryError

My application reads the array row by row at a frequency of 50 Hz, like this:

for row in array:
    [operation with row]
    time.sleep(0.02)  # in reality, the whole cycle takes 0.02 s (including the operation time)
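
Side note: to keep the whole cycle at exactly 0.02 s including the operation time, one can sleep only for the remainder of each cycle. A minimal sketch of such a drift-free pacing loop, assuming array is the loaded array from above:

import time

PERIOD = 0.02  # 50 Hz target cycle
next_tick = time.time()
for row in array:
    # [operation with row]
    next_tick += PERIOD
    delay = next_tick - time.time()
    if delay > 0:
        time.sleep(delay)  # sleep only for the remainder of the cycle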

I am looking for some kind of generator:

def generate_rows(path):
    array = np.load(path)
    for row in array:
        yield row

This solves the memory problem, but I guess I will lose the near-zero latency (loading the array takes time).

Therefore my question is: is there a way to generate rows like a generator does, but with the first rows ready, so to speak, 'immediately', with near-zero latency?


EDIT: Based on the comments by @Lukas Koestler and @hpaulj I tried memmap, but the result is surprisingly not good, because memmap runs out of memory sooner than simply loading the full arrays.

WINDOWS 10

I saved 1000 numpy arrays (shape = [500,30,3]) to disk and tried to cache them with np.load, both with and without memmap:

import numpy as np
import os

mats = os.listdir('matrixes')
cache = []
for i in range(10):
    for n in mats:
        cache.append(np.load('matrixes\\{}'.format(n), mmap_mode='r'))  # load with memmap
        # cache.append(np.load('matrixes\\{}'.format(n)))  # load without memmap

    print('{} objects stored in cache'.format((i + 1) * 1000))

After running both variants (with and without memmap), these two errors occurred:

Memmap, after storing 4000 memmap objects:

...
  File "C:\Python27\lib\site-packages\numpy\core\memmap.py", line 264, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
WindowsError: [Error 8] Not enough memory resources are available to process this command

Plain np.load without memmap, after caching 5000 arrays:

....
File "C:\Python27\lib\site-packages\numpy\lib\format.py", line 661, in read_array
    array = numpy.fromfile(fp, dtype=dtype, count=count)
MemoryError 

Raspberry Pi Zero

As @Alex Yu pointed out, I was testing on Windows 10. Switching to the Raspberry Pi Zero:

Without memmap I got above 1000 numpy arrays (it took quite long) and then I got

1000 objects stored in cache
Killed

With memmaps, I quite quickly got above 1000 memmap objects, but then got a different error:

File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 416, in load
    return format.open_memmap(file, mode=mmap_mode)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/format.py", line 792, in open_memmap
    mode=mode, offset=offset)
  File "/usr/lib/python2.7/dist-packages/numpy/core/memmap.py", line 264, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
mmap.error: [Errno 24] Too many open files

If I am not mistaken, this error happens when a lot of files are opened but never closed.
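
On Linux this is easy to check, since each memmap keeps its backing file open for as long as it lives. A minimal sketch (Linux-only, counting the process's file descriptors via /proc; data.npy is the file from the first snippet):

import os
import numpy as np

def open_fds():
    # Linux-only: number of file descriptors this process currently holds
    return len(os.listdir('/proc/self/fd'))

before = open_fds()
m = np.load('data.npy', mmap_mode='r')  # the backing file stays open as long as the memmap lives
print(open_fds() - before)              # one extra descriptor per memmap
del m  # dropping the last reference closes the mapping and releases the descriptor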

Upvotes: 5

Views: 1843

Answers (1)

Martin

Reputation: 3385

Thanks to @Lukas Koestler and @hpaulj for directing me toward using memmap,

and thanks to @Alex Yu for making the solution a reality.


Solution to my own question

Using

np.load(path, mmap_mode='r')

works, but is limited by the operating system's limit on open files. Windows and Linux throw different errors:

WIN

WindowsError: [Error 8] Not enough memory resources are available to process this command

LIN

mmap.error: [Errno 24] Too many open files

This was solved with the link given by @Alex Yu: extend the limit of opened files.

Extract:

Open

/etc/security/limits.conf

and paste the following towards the end:

*         hard    nofile      500000
*         soft    nofile      500000
root      hard    nofile      500000
root      soft    nofile      500000

End of Extract
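
The limit can also be inspected, and raised up to the hard limit, from within Python itself via the standard resource module (Unix-only); a minimal sketch:

import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print('open files: soft limit = {}, hard limit = {}'.format(soft, hard))

# raise the soft limit to the hard limit, for this process only
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))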

There is still a limit, but it increased the count up to 8000 objects in the list:

...
8000 objects stored in cache

until

Traceback (most recent call last):
...
mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
mmap.error: [Errno 12] Cannot allocate memory

For me this is quite enough.
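
With the open-file limit raised, the generator I originally asked for can be built directly on top of a memmap, so the first row is available almost immediately and rows are paged in from disk only as they are accessed. A minimal sketch:

import numpy as np

def generate_rows(path):
    array = np.load(path, mmap_mode='r')  # near-instant: only maps the file, reads no data yet
    for row in array:
        yield row  # each row is paged in from disk on first access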


General overview of the different approaches to my problem

tested on arrays with shape [500,30,3]

1) Simple Load: Without caching

array = np.load(path)
[process rows]

Slowest, but the most memory-efficient.

cache_limit = 0 (Arrays in dictionary)

2) Hard cache - loading full arrays into a dictionary

cache_raw = {i: np.load(os.path.join('array_folder', i)) for i in os.listdir('array_folder')}
...
temporary_array = cache_raw[some_array]
[process rows with temporary_array]

Ultra fast but very memory inefficient

cache_limit ~ 1000 (RPi Zero, arrays in dictionary)

3) Memmap Cache

cache_memmap = {i: np.load(os.path.join('array_folder', i), mmap_mode='r') for i in os.listdir('array_folder')}
...
memmap_array = cache_memmap[some_array]
[process rows with memmap_array]

Reasonable speed, memory efficient.

cache_limit ~ 8000 (RPi Zero, arrays in dictionary)


Results

Timing results (in seconds) of loading the first row, for 20 random accesses, with each approach:
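
They were gathered roughly like this (a minimal sketch, not the exact harness; 'matrixes' is the folder used above, and the np.load call is swapped for whichever approach is being measured):

import os
import time
import random
import numpy as np

paths = [os.path.join('matrixes', n) for n in os.listdir('matrixes')]
for _ in range(20):
    path = random.choice(paths)
    start = time.time()
    first_row = np.load(path, mmap_mode='r')[0]  # swap in the loading approach under test
    print(time.time() - start)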

Memmap
0.00122714042664
0.00237703323364
0.00152182579041
0.000735998153687
0.000724077224731
0.000736951828003
0.000741004943848
0.000698089599609
0.000723123550415
0.000734090805054
0.000775814056396
0.00343084335327
0.000797033309937
0.000717878341675
0.000727891921997
0.000733852386475
0.000690937042236
0.00178194046021
0.000714063644409
0.000691175460815
Hard cache
0.000302076339722
0.000305891036987
0.000910043716431
0.000320911407471
0.000298976898193
0.000309944152832
0.000294923782349
0.000304937362671
0.000298023223877
0.00031590461731
0.000324010848999
0.000273942947388
0.000274181365967
0.000286817550659
0.000277042388916
0.000297784805298
0.000288009643555
0.000318050384521
0.00031304359436
0.000298023223877
Without cache
0.0350978374481
0.0103611946106
0.0172200202942
0.0349309444427
0.0177171230316
0.00722813606262
0.0286860466003
0.0435371398926
0.0261130332947
0.0302798748016
0.0361919403076
0.0286440849304
0.0175659656525
0.035896062851
0.0307757854462
0.0364079475403
0.0258250236511
0.00768494606018
0.025671005249
0.0261180400848

EDIT: Additional measurements:

Average time over 100 unique accesses, repeated 5 times for each approach:

Memmap
0.000535547733307 # very good speed
0.000488042831421
0.000483453273773
0.000485241413116
0.00049720287323
Hard cache
0.000133073329926 # 4x faster than memmap
0.000132908821106
0.000131068229675
0.000130603313446
0.000126478672028
Without cache
0.0996991252899 # very slow
0.0946901941299
0.0264434242249 # interesting to note here, something I suspected:
0.0239776492119 # np.load seems to have a cache of its own.
0.0208633708954 # If you load a particular numpy array several times in one program,
# it loads faster, a kind of integrated cache (most likely the OS page
# cache rather than numpy itself). From my own experience it is very
# unreliable and cannot be counted on.

Upvotes: 2
