saeedgnu
saeedgnu

Reputation: 4376

Python SQLite: enforce UTF-8 encoding

I'm developing a cross-platform Python (3.7+) application, and I need to rely on sort order of TEXT columns in SQLite, meaning the comparison algorithm of TEXT values must be based on UTF-8 bytes. Even if the system encoding (sys.getdefaultencoding()) is not utf-8.

But in documentation of sqlite3 module I can't find an encoding option for sqlite3.connect.

And I read that the use of sys.setdefaultencoding("utf-8") is an ugly hack and highly discouraged (that's why we need to reload(sys) before calling it)

So what's the solution?

Upvotes: 3

Views: 462

Answers (1)

saeedgnu
saeedgnu

Reputation: 4376

Looking at Python's _sqlite/connection.c code, either sqlite3_open_v2 or sqlite3_open is called (depending on a compile flag). And based on sqlite doc, both of them use UTF-8 as default database encoding. I'm still not sure about the meaning of word "default" since it doesn't mention any way to override it! But I it doesn't look like that Python can open with another encoding.

#ifdef SQLITE_OPEN_URI
    Py_BEGIN_ALLOW_THREADS
    rc = sqlite3_open_v2(database, &self->db,
                         SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE |
                         (uri ? SQLITE_OPEN_URI : 0), NULL);
#else
    if (uri) {
        PyErr_SetString(pysqlite_NotSupportedError, "URIs not supported");
        return -1;
    }
    Py_BEGIN_ALLOW_THREADS
    rc = sqlite3_open(database, &self->db);
#endif

Upvotes: 1

Related Questions