Steve
Steve

Reputation: 4107

Python 3 SQL Binary Data INSERT without Conversion

I am using MySql and Python 3.6/3.8 with mysqlclient to maintain a database with a binary field. The code was constructed before my time in Python 2.7 to insert directly into field with binary data. For example, the table looks like

+-------------------------+--------------------------------------------------------------------+------+-----+------------+-------+
| Field                   | Type                                                               | Null | Key | Default    | Extra |
+-------------------------+--------------------------------------------------------------------+------+-----+------------+-------+
| license_id              | varchar(36)                                                        | NO   | PRI | NULL       |       |
| data                    | binary(32)                                                         | NO   | UNI | NULL       |       |
+-------------------------+--------------------------------------------------------------------+------+-----+------------+-------+

and the insert command looked like:

 dbconn.execute("""INSERT into datatable (
                            license_id,
                            data)
                        VALUES ("{:s}", "{:s}")
                    """.format(
                        str(self._license_id),                        
                        escape_string(self._data)))

and this would result in Python 2.7 inserting the binary data directly in the command like

 INSERT into datatable (license_id, data) VALUES ("XXX-XXX-52", "������C�...\r�x�b24�B")

but in Python 3 the data is a byte array type and cannot be inserted into a string and attempting to decode it results in:

escape_string(self._auth_hash).decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 5: invalid continuation byte

Other questions seem to indicate a viable path to fix this is to convert the binary to hex (here, for example) but is there any way to insert into a Python 3 unicode string without errors as the original code did?

It appears by removing the quotes around "{:s}" and using binascii.b2a_hex(self._data) would have the same intent, but it's not clear this will have the exact same behavior or if there will be problem. I expect this to be a recurring problem and am looking for the solution with the least side effects possible.

Upvotes: 3

Views: 1153

Answers (1)

snakecharmerb
snakecharmerb

Reputation: 55963

This can be done by letting the database connector handle the escaping automatically instead of using a combination of manual escaping and string formatting.

Given these bytes (Python 2.7 will accept the b"" form as str):

bs = b"\xec\xa6\x8fH~\x83\\\xfeoc\x92\xf9\xad\xe6'\xc6\x1a\x00j\x99$x\xd9\xda\xef\x0c\x080rm\xe6\xcd"

The code in the question and this Python 3 code will insert the same value into the data column.

conn = MySQLdb.connect(db='test')
cur = conn.cursor()
cur.execute(
        """INSERT INTO datatable (license_id, data) VALUES (%s, %s)""",
        ('python3', bs))
conn.commit()
conn.close()

Upvotes: 2

Related Questions