shafty
shafty

Reputation: 605

Python/SQLite storing lists as binaries (blobs)

The official documentation for sqlite suggests storing lists as binary objects. Google led me to various suggestions. One was to use the array module (array.array('B', my_list2), but this didn't work for a non-trivial list:

my_list2 = [(23,"Bob"), (22,"Alice")]
array.array('B',my_list2)

TypeError: an integer is required

Another suggestion involved using pickle, but someone interjected claiming it was not secure. One final suggestion is creating a new table for each list variable, of which there are a few. I'm hesitant to make a complex schema though.

What should I do? How should I store my_list2, along with the other lists, in a database?

Edit

Found an elegant and tidy solution that works for simple and complex cases with minimal code:

import json
my_list2 = [(23,"Bob Baker"), (22,"Alice Adams")]
my_list2_str = json.dumps(my_list2)
print type(my_list2_str)
 <type 'str'>
list2 = json.loads(my_list2_str)
print list2, type(list2)
 [(23, u"Bob Baker"), (22, u"Alice Adams")] <type 'list'>

Upvotes: 4

Views: 4523

Answers (1)

acattle
acattle

Reputation: 3113

This question seems very similar to this earlier SO question so at first I thought that might solve your problem. But looking again at your question it seems like you did read this question since you mention two of the methods they propose. Also, since your datatype is different (list of tuples instead of list of ints) I'm going to give you a pass.

Doing some research I find a lot of code samples that use a method sqlite3.Binary() (such as here). This might be what you want but what worries me is that I can find absolutely no documentation for this function in the Sqlite3 Python Interface API. As such, I would advise against using it. I'm guessing this method was deprecated, but I can't find any clear documentation about what replaced it.

That said, if you read the Sqlite3 Python Interface API, you see it automatically converts BLOBs to python buffer objects (and buffer objects to BLOBs). So it seems to me that if you can convert your list to a buffer then you can trivially store it as a BLOB.

In my research I found that lists cannot be stored as buffers. I also found that while there are ways of converting a list into a buffer, they require lists of simple types (i.e. not tuples). Therefore, I think your best bet is to define some utility methods for converting your lists to and from strings and then convert the strings to buffers (and back when you retreive them from the database).

def myListToStr(myList):
    """This method takes a list of (int, str) tuples and converts them to a string"""

    strList = ""
    for item in myList:
        num, name = item #split the tuple

        strList += "{}:{} ".format(num, name) #append the tuple in "num:name" format with a " " delimiter

    return strList[:-1] #remove the final space (unneeded)

def strToMyList(myStr):
    """This method takes a string in the format "int:str int:str int:str..."
    and converts it to a list of (int, str) tuples"""

    myList = []
    for tup in myStr.split(" "): #for each converted tuple
        numStr, name = tup.split(":") #split the tuple

        num = int(numStr) #NOTE: this will throw an error if numStr.isdigit() is False
        myList.append(num, name)

    return myList

Now, converting to a buffer is as easy as

my_list2Buff = buffer(myListToStr(my_list2))

And back...

my_list2 = strToList(str(my_list2Buff))

Upvotes: 3

Related Questions