Bob Ebert

Reputation: 1411

How to encode a string in a SQL CHAR

For example, 'admin' encoded is CHAR(97, 100, 109, 105, 110).

I would like to know if there is a module or a way to convert each letter of a string to SQL CHARs. If not, how do I convert it myself? I have access to a chart that says a=97, b=98, etc., if that helps.

Upvotes: 0

Views: 895

Answers (1)

abarnert

Reputation: 365995

I'm not sure why you need this at all. Getting the CHAR(...) representation of a string's ASCII or Unicode code points isn't hard, but I'm pretty sure you don't need it: databases already know how to compare CHAR fields to strings passed in SQL. The exception would be something like generating a dump that looks exactly like the ones you get from some other tool. But, assuming you do need to do this, here's how.


I think you're looking for the ord function:

Given a string representing one Unicode character, return an integer representing the Unicode code point of that character. For example, ord('a') returns the integer 97 and ord('\u2020') returns 8224. This is the inverse of chr().

This works because Python has access to the same chart that you have. In fact, it has a bunch of different ones, one for each encoding it knows about; that chart is pretty much what an encoding is.

So, for example:

def encode_as_char(s):
    return 'CHAR({})'.format(', '.join(str(ord(c)) for c in s))
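
As a quick check, here is a sketch that runs the function on the example from the question (Python 3 assumed; the comment is mine):

```python
def encode_as_char(s):
    # Join the decimal code point of each character into a CHAR(...) expression.
    return 'CHAR({})'.format(', '.join(str(ord(c)) for c in s))

print(encode_as_char('admin'))  # CHAR(97, 100, 109, 105, 110)
```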

Or, if you just wanted a list of numbers, not a string made out of those numbers, it's even simpler:

def encode_as_char(s):
    return [ord(c) for c in s]

This is all assuming that either (a) your database is storing Unicode characters and you're using Python 3, or (b) your database is storing 8-bit characters and you're using Python 2. Otherwise, you need an encode or decode step in there as well.

For a Python 3 Unicode string to a UTF-8 database (notice that we don't need ord here, because iterating over a Python 3 bytes object already yields integers):

def encode_as_utf8_char(s):
    return 'CHAR({})'.format(', '.join(str(c) for c in s.encode('utf-8')))
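
To see why the encode step matters, here is a small Python 3 sketch comparing the two functions on a non-ASCII string. The sample string 'café' is my own choice, not something from the question. For plain ASCII input both functions give the same numbers; they only diverge once a character needs more than one UTF-8 byte.

```python
def encode_as_char(s):
    # One number per Unicode code point.
    return 'CHAR({})'.format(', '.join(str(ord(c)) for c in s))

def encode_as_utf8_char(s):
    # One number per UTF-8 byte; iterating over bytes yields ints in Python 3.
    return 'CHAR({})'.format(', '.join(str(b) for b in s.encode('utf-8')))

sample = 'caf\u00e9'  # 'café', written with an escape to avoid source-encoding issues

print(encode_as_char(sample))       # CHAR(99, 97, 102, 233)
print(encode_as_utf8_char(sample))  # CHAR(99, 97, 102, 195, 169)
```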

For a Python 2 UTF-8 string to a Unicode database:

def encode_utf8_as_char(s):
    return 'CHAR({})'.format(', '.join(str(ord(c)) for c in s.decode('utf-8')))

Upvotes: 2
