Reputation: 4107
I am using MySql and Python 3.6/3.8 with mysqlclient to maintain a database with a binary field. The code was constructed before my time in Python 2.7 to insert directly into field with binary data. For example, the table looks like
+-------------------------+--------------------------------------------------------------------+------+-----+------------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------+--------------------------------------------------------------------+------+-----+------------+-------+
| license_id | varchar(36) | NO | PRI | NULL | |
| data | binary(32) | NO | UNI | NULL | |
+-------------------------+--------------------------------------------------------------------+------+-----+------------+-------+
and the insert command looked like:
dbconn.execute("""INSERT into datatable (
license_id,
data)
VALUES ("{:s}", "{:s}")
""".format(
str(self._license_id),
escape_string(self._data)))
and this would result in Python 2.7 inserting the binary data directly in the command like
INSERT into datatable (license_id, data) VALUES ("XXX-XXX-52", "������C�...\r�x�b24�B")
but in Python 3 the data is a byte array type and cannot be inserted into a string and attempting to decode it results in:
escape_string(self._auth_hash).decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 5: invalid continuation byte
Other questions seem to indicate a viable path to fix this is to convert the binary to hex (here, for example) but is there any way to insert into a Python 3 unicode string without errors as the original code did?
It appears by removing the quotes around "{:s}"
and using binascii.b2a_hex(self._data)
would have the same intent, but it's not clear this will have the exact same behavior or if there will be problem. I expect this to be a recurring problem and am looking for the solution with the least side effects possible.
Upvotes: 3
Views: 1153
Reputation: 55963
This can be done by letting the database connector handle the escaping automatically instead of using a combination of manual escaping and string formatting.
Given these bytes (Python 2.7 will accept the b""
form as str
):
bs = b"\xec\xa6\x8fH~\x83\\\xfeoc\x92\xf9\xad\xe6'\xc6\x1a\x00j\x99$x\xd9\xda\xef\x0c\x080rm\xe6\xcd"
The code in the question and this Python 3 code will insert the same value into the data
column.
conn = MySQLdb.connect(db='test')
cur = conn.cursor()
cur.execute(
"""INSERT INTO datatable (license_id, data) VALUES (%s, %s)""",
('python3', bs))
conn.commit()
conn.close()
Upvotes: 2