PDP

Reputation: 181

Python base64 encoding a list

I am new to encoding in Python and am trying to understand it. Apologies if this has already been asked and answered.

I am trying to base64-encode a Python list and then decode it. When I pass the list directly to base64.b64encode, I get the error below.

>>> import base64
>>> my_list = [1, 2, 3]
>>> encoded_list = base64.b64encode(my_list)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/base64.py", line 54, in b64encode
    encoded = binascii.b2a_base64(s)[:-1]
TypeError: b2a_base64() argument 1 must be string or buffer, not list

To fix it, I converted the list to a string and passed that to the encode function, and the encoding succeeded.

>>> encoded_list = base64.b64encode(str(my_list))
>>> encoded_list
'WzEsIDIsIDNd'

When I decode it, I get a string back, as shown below.

>>> decoded_list = base64.b64decode(encoded_list)
>>> decoded_list
'[1, 2, 3]'
>>> type(decoded_list)
<type 'str'>

But my intention was to encode and decode the list itself, not to convert the list to a string and then the string back to a list.

I am pretty sure this is not the right way to encode objects like a dict or a list. If so, can someone please explain how to encode/decode non-string objects in Python?

Thanks very much.

Upvotes: 3

Views: 16842

Answers (3)

polarise

Reputation: 2413

You are interested in encoding the data in the list, not the list object itself. I therefore suggest using struct to pack the data.

import base64
import struct

x = range(10)
# Pack as little-endian 4-byte signed ints, one 'i' per element.
y = struct.pack('<{}i'.format(len(x)), *x)
z = base64.b64encode(y)

z will now be an encoding of the data in the list.

You can decode it back and retrieve the list as follows:

y = base64.b64decode(z)
list(struct.unpack('<{}i'.format(len(y)//4), y))
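
For reference, here is a minimal round-trip sketch of the same idea written for Python 3. It assumes every element fits in a 4-byte signed integer; the names values, packed, raw, count, and restored are only illustrative.

```python
import base64
import struct

values = list(range(10))

# '<' means little-endian, 'i' means 4-byte signed int; one 'i' per value.
packed = struct.pack('<{}i'.format(len(values)), *values)
encoded = base64.b64encode(packed)  # base64 bytes, safe to store as text

# Decoding: recover the element count from the byte length, then unpack.
raw = base64.b64decode(encoded)
count = len(raw) // struct.calcsize('<i')  # integer division, 4 bytes per int
restored = list(struct.unpack('<{}i'.format(count), raw))

print(restored == values)
```

The integer division matters in Python 3: a plain / would produce a float, and the resulting format string would not be a valid struct format.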

Upvotes: 1

7stud

Reputation: 48599

The error is pretty self-explanatory:

b2a_base64() argument 1 must be string or buffer, not list

How about calling an encoding method that will take a list?

import pickle 

data = [ 
    1,
    "hello",
    {
        'a': [1, 2.0, 3, 4+6j],
        'b': ("character string", b"byte string"),
        'c': set([None, True, False])
    }
]

#Write encoded string to a file:
with open('data.pickle', 'wb') as f:
    pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)

#Read encoded string from file:
with open('data.pickle', 'rb') as f:
    print(f.read())  #Display the encoded string.
    f.seek(0)
    data = pickle.load(f)
    print(data)
    print(data[2]['a'])  #Show that data is actually a python list.

--output:--
b'\x80\x04\x95\x87\x00\x00\x00\x00\x00\x00\x00]\x94(K\x01\x8c\x05hello\x94}\x94(\x8c\x01a\x94]\x94(K\x01G@\x00\x00\x00\x00\x00\x00\x00K\x03\x8c\x08builtins\x94\x8c\x07complex\x94\x93\x94G@\x10\x00\x00\x00\x00\x00\x00G@\x18\x00\x00\x00\x00\x00\x00\x86\x94R\x94e\x8c\x01c\x94\x8f\x94(\x89\x88N\x90\x8c\x01b\x94\x8c\x10character string\x94C\x0bbyte string\x94\x86\x94ue.'

[1, 'hello', {'a': [1, 2.0, 3, (4+6j)], 'c': {False, True, None}, 'b': ('character string', b'byte string')}]

[1, 2.0, 3, (4+6j)]

And, if you want to work base64 encoding into the mix:

import pickle 
import base64

data = [ 
    1,
    "hello",
    {
        'a': [1, 2.0, 3, 4+6j],
        'b': ("character string", b"byte string"),
        'c': set([None, True, False])
    }
]

pstr = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
bstr = base64.b64encode(pstr)
print(pstr)
print(bstr)

--output:--
b'\x80\x04\x95\x87\x00\x00\x00\x00\x00\x00\x00]\x94(K\x01\x8c\x05hello\x94}\x94(\x8c\x01b\x94\x8c\x10character string\x94C\x0bbyte string\x94\x86\x94\x8c\x01c\x94\x8f\x94(\x89\x88N\x90\x8c\x01a\x94]\x94(K\x01G@\x00\x00\x00\x00\x00\x00\x00K\x03\x8c\x08builtins\x94\x8c\x07complex\x94\x93\x94G@\x10\x00\x00\x00\x00\x00\x00G@\x18\x00\x00\x00\x00\x00\x00\x86\x94R\x94eue.'

b'gASVhwAAAAAAAABdlChLAYwFaGVsbG+UfZQojAFilIwQY2hhcmFjdGVyIHN0cmluZ5RDC2J5dGUgc3RyaW5nlIaUjAFjlI+UKImITpCMAWGUXZQoSwFHQAAAAAAAAABLA4wIYnVpbHRpbnOUjAdjb21wbGV4lJOUR0AQAAAAAAAAR0AYAAAAAAAAhpRSlGV1ZS4='

pstr = base64.b64decode(bstr)
print(pstr)
new_data = pickle.loads(pstr)
print(new_data[2]['a'][0])

--output:--
----------------(compare to previous pstr)
b'\x80\x04\x95\x87\x00\x00\x00\x00\x00\x00\x00]\x94(K\x01\x8c\x05hello\x94}\x94(\x8c\x01b\x94\x8c\x10character string\x94C\x0bbyte string\x94\x86\x94\x8c\x01c\x94\x8f\x94(\x89\x88N\x90\x8c\x01a\x94]\x94(K\x01G@\x00\x00\x00\x00\x00\x00\x00K\x03\x8c\x08builtins\x94\x8c\x07complex\x94\x93\x94G@\x10\x00\x00\x00\x00\x00\x00G@\x18\x00\x00\x00\x00\x00\x00\x86\x94R\x94eue.'

1

Or, you can use eval(), gulp, to evaluate a string:

mystr = '''
[ 
    1,
    "hello",
    {
        'a': [1, 2.0, 3, 4+6j],
        'b': ("character string", b"byte string"),
        'c': set([None, True, False])
    }
]
'''

mylist = eval(mystr)
print(mylist[0])

--output:--
1

So, you could stringify your list, base64 encode the string, then base64 decode it, then eval the result to get the original list back. Because eval executes arbitrary code in a string, such as a command to delete the files on your hard drive, you don't want to eval untrusted strings. The docs for the pickle module contain a similar warning about unpickling untrusted data.
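
If the data is just literals, as with the [1, 2, 3] list in the question, ast.literal_eval from the standard library is a safer stand-in for eval. The sketch below is for Python 3 and is only one possible way to wire the steps together; the variable names are illustrative.

```python
import ast
import base64

my_list = [1, 2, 3]

# Stringify, then base64 encode (b64encode needs bytes in Python 3).
encoded = base64.b64encode(repr(my_list).encode('utf-8'))

# Decode back to text, then parse the literal without executing code.
decoded_text = base64.b64decode(encoded).decode('utf-8')
restored = ast.literal_eval(decoded_text)

print(restored == my_list)
```

Note that literal_eval only accepts literals, so it rejects function calls such as set([None, True, False]) from the bigger example above.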

Upvotes: 1

SCB

Reputation: 6149

Try encoding/decoding with JSON instead of str().

import json
import base64

my_list = [1, 2, 3]
json_encoded_list = json.dumps(my_list)
#: '[1, 2, 3]'
b64_encoded_list = base64.b64encode(json_encoded_list)
#: 'WzEsIDIsIDNd'
decoded_list = base64.b64decode(b64_encoded_list)
#: '[1, 2, 3]'
my_list_again = json.loads(decoded_list)
#: [1, 2, 3]

But in practice, for pretty much any storage purpose I can think of, there's no real reason to base64 encode your JSON output. Just encode to and decode from JSON.

my_list = [1, 2, 3]
json_encoded_list = json.dumps(my_list)
#: '[1, 2, 3]'
my_list_again = json.loads(json_encoded_list)
#: [1, 2, 3]

If you need anything more complicated than lists and dictionaries (JSON arrays and objects), then probably go with 7stud's pickle method. However, JSON is simple, readable, widely supported, and compatible with other languages. I'd choose it whenever possible.
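
The base64 snippet above is written for Python 2, matching the question. On Python 3, b64encode expects bytes, so a rough equivalent (a sketch, assuming UTF-8 as the text encoding) looks like this. It also shows the kind of object JSON cannot handle, using a set as the example.

```python
import base64
import json

my_list = [1, 2, 3]

# JSON text -> UTF-8 bytes -> base64 bytes
json_text = json.dumps(my_list)
b64_bytes = base64.b64encode(json_text.encode('utf-8'))

# base64 bytes -> UTF-8 text -> Python list
restored = json.loads(base64.b64decode(b64_bytes).decode('utf-8'))
print(restored == my_list)

# JSON has no set type, so json.dumps raises TypeError for a set.
try:
    json.dumps({1, 2, 3})
    set_is_serializable = True
except TypeError:
    set_is_serializable = False
print(set_is_serializable)
```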

Upvotes: 7
