arash

Reputation: 967

python LMDB large DBs (Memory Limit Error)

I have a large lmdb, about 800K images. I want to just read the entries one by one. My code is very simple and looks like this:

with env.begin() as txn:
    cursor = txn.cursor()
    for key, value in cursor:
        print(key)

But after reading about 70,000 entries it runs out of memory (~10 GB), and I have no idea why. I also tried the approach below, restarting the transaction every 10,000 entries, but it didn't work either.

for r in range(0, env.stat()['entries']):
    if r % 10000 == 0:
        if r != 0:
            txn.commit()
            cur.close()
        txn = env.begin()
        cur = txn.cursor()
        print("Change Change Change " + str(r))
        sys.stdout.flush()
        if r == 0:
            cur.first()
        else:
            cur.set_range(key)
            cur.next()
    key, value = cur.item()

Any suggestions?

Upvotes: 2

Views: 2230

Answers (1)

gonz

Reputation: 5276

The error trace would help narrow this down. I'd check the map_size parameter. From the docs:

Maximum size database may grow to; used to size the memory mapping. If database grows larger than map_size, an exception will be raised and the user must close and reopen Environment. On 64-bit there is no penalty for making this huge (say 1TB). Must be <2GB on 32-bit.

This would be an example when writing:

with lmdb.open(LMDB_DIR, map_size=LMDB_MAX_SIZE) as env:
    with env.begin(write=True) as txn:
        return txn.put(mykey, value)

Upvotes: 1

Related Questions