Toekan
Toekan

Reputation: 55

Python dictionary of sets in SQL

I have a dictionary in Python where the keys are integers and the values sets of integers. Considering the potential size (millions of key-value pairs, where a set can contain from 1 to several hundreds of integers), I would like to store it in a SQL (?) database, rather than serialize it with pickle to store it and load it back in whenever I need it.

From reading around I see two potential ways to do this, both with its downsides:

In addition, once the database is complete, I will always need sets as a whole, so idea 1 seems to be faster? If I query for all in primary keys BETWEEN certain values, or LIKE certain values, to obtain my whole set in system 2, will the SQL database (sqlite) still work as a hashtable? Or will it linearly search for all values that fit my BETWEEN or LIKE search?

Overall, what's the best way to tackle this problem? Obviously, if there's a completely different 3rd way that solves my problems naturally, feel free to suggest it! (haven't found any other solution by searching around)

I'm kind of new to Python and especially to databases, so let me know if my question isn't clear. :)

Upvotes: 3

Views: 369

Answers (1)

Chippiewill
Chippiewill

Reputation: 343

You second answer is nearly what I would recommend. What I would do is have three columns:

  • Set ID
  • Key
  • Value

I would then create a composite primary key on the Set ID and Key which guarantees that the combination is unique:

CREATE TABLE something (
  set, 
  key, 
  value, 
  PRIMARY KEY (set, key)
);

You can now add a value straight into a particular set (Or update a key in a set) and select all keys in a set.

This being said, your first strategy would be more optimal for read-heavy workloads as the size of the indexes would be smaller.

will the SQL database (sqlite) still work as a hashtable?

SQL databases tend not to use hashtables. Nor do they usually do a sequential lookup. What they do is usually create an index (Which tends to be some kind of tree, e.g. a B-tree) which allows for range lookups (e.g. where you don't know exactly what keys you're looking for).

Upvotes: 1

Related Questions