Mr.cysl

Reputation: 1614

How to save RAM when dealing with very large python dict?

I have on the order of a billion key-value pairs and need to build a look-up table for them. I currently use the native Python dict, but it is very slow when adding the pairs and consumes a lot of RAM (several hundred GB). The operations I need are to 1) add every pair to the table and 2) look up values a few million times. Are there any recommended approaches to meet these requirements? I have a machine with a few hundred gigabytes of memory (not enough to hold everything in memory) and a good number of CPU cores.
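For reference, my current approach is roughly the following (the two pairs below are placeholders for the real data source, which yields around a billion pairs):

```python
# Rough shape of my current approach; the two pairs below stand in
# for the real data source, which yields around a billion pairs.
pairs = [("key-1", "value-1"), ("key-2", "value-2")]

table = {}
for key, value in pairs:
    table[key] = value          # building the in-memory look-up table

# Later: a few million lookups like this one.
print(table.get("key-1"))       # -> "value-1"
```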

Upvotes: 0

Views: 70

Answers (1)

Ewan

Reputation: 15068

If this data is not shared between machines (and since it currently lives in an in-memory dict, I don't think it is), then I would recommend using a local SQLite database.

Python has a built-in library (sqlite3) for interacting with SQLite, which is fast (implemented in C), stores data on disk (saving RAM), and is available almost everywhere.
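As a rough sketch (the file name, table name, and sample pairs are just examples), sqlite3 can be used as an on-disk key-value store like this:

```python
import sqlite3

# Open (or create) an on-disk database; the data lives on disk, not in RAM.
conn = sqlite3.connect("lookup.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"
)

# Bulk-insert pairs inside a single transaction for speed.
# `pairs` stands in for your iterable of (key, value) tuples.
pairs = [("key-1", "value-1"), ("key-2", "value-2")]
with conn:
    conn.executemany("INSERT OR REPLACE INTO kv VALUES (?, ?)", pairs)

# Look up a single key; the PRIMARY KEY index keeps this fast.
row = conn.execute("SELECT value FROM kv WHERE key = ?", ("key-1",)).fetchone()
print(row[0] if row else None)

conn.close()
```

Inserting in large batches with `executemany` inside one transaction avoids committing each row individually, which matters at this scale.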

Upvotes: 2
