augray

Reputation: 3161

What column type does SQLAlchemy use for "Text" on MySQL?

My overall use-case is trying to determine whether I can write a somewhat database agnostic (at least supporting Postgres and MySQL) store for some large data as raw text (think ~500MB as a rough theoretical upper bound).

Based on this answer about MySQL string/text types, it looks like only the LONGTEXT column type can meet my requirements. I'm using SQLAlchemy, which claims for its Text column type that it is for variable length strings, but also that it generally maps to the database's CLOB or TEXT types. MySQL doesn't have a CLOB type (though it does have a BLOB), and its TEXT type would be insufficient for my needs.

So, what column type does SQLAlchemy use for Text on MySQL?

Upvotes: 26

Views: 75987

Answers (2)

tandy

Reputation: 1948

In SQLAlchemy 1.2.7, the Text data type maps to the MySQL TEXT type, or to LONGTEXT, depending on the length you specify.

from sqlalchemy import Column, Text

body_html = Column(Text())
body_plain = Column(Text(4294000000))

Creates the following in MySQL:

+------------+----------+
| Field      | Type     |
+------------+----------+
| body_html  | text     |
| body_plain | longtext |
+------------+----------+
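One way to see what DDL SQLAlchemy itself emits (before the server applies any type promotion of its own) is to compile the type against the MySQL dialect; no database connection is needed. A sketch:

```python
from sqlalchemy import Text
from sqlalchemy.dialects import mysql

# With no length, the MySQL dialect renders the generic Text type
# as plain TEXT.
plain = Text().compile(dialect=mysql.dialect())
print(plain)  # TEXT

# With an explicit length, the length is carried into the DDL; the
# MySQL server then creates the smallest TEXT flavor that can hold
# that many characters, which for 4294000000 is LONGTEXT.
sized = Text(4294000000).compile(dialect=mysql.dialect())
print(sized)
```

This shows why the two columns above end up as different MySQL types even though both are declared with the same generic Text.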

Upvotes: 19

Bill Karwin

Reputation: 562280

Looks like SQLAlchemy supports LONGTEXT:

$ python
Python 2.7.13 (default, Sep 29 2017, 15:31:18) 
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from sqlalchemy.dialects.mysql import LONGTEXT
>>> 

See how to use vendor-specific types here: http://docs.sqlalchemy.org/en/latest/core/type_basics.html#vendor-specific-types
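For example, here is a sketch of the vendor-specific approach from that link, using with_variant() so that the generic Text becomes LONGTEXT only when the MySQL dialect is in play (the table and column names are made up for illustration):

```python
from sqlalchemy import Column, Integer, MetaData, Table, Text
from sqlalchemy.dialects import mysql, postgresql
from sqlalchemy.dialects.mysql import LONGTEXT
from sqlalchemy.schema import CreateTable

metadata = MetaData()

# Declare the column with the generic Text type, but override it
# with LONGTEXT for the MySQL dialect only.
docs = Table(
    "docs",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("body", Text().with_variant(LONGTEXT, "mysql")),
)

# Compile the DDL against each dialect without connecting to a server.
mysql_ddl = str(CreateTable(docs).compile(dialect=mysql.dialect()))
pg_ddl = str(CreateTable(docs).compile(dialect=postgresql.dialect()))

print(mysql_ddl)  # body rendered as LONGTEXT
print(pg_ddl)     # body rendered as TEXT
```

This keeps the model itself database-agnostic while pinning down the storage type on the one backend where the default would be too small.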

For what it's worth, trying to develop a totally brand-neutral database layer is difficult, and rarely worth the effort. I worked on the Zend Framework 1.0 some years ago, and I tried to create a generic unit testing suite for all the SQL databases supported by that framework. I found that very few data types are supported in the same way across all implementations of SQL, despite them all claiming to support the ANSI/ISO SQL standard.

Ultimately, you have to develop your own class hierarchy for your data layer, and implement the code slightly differently for each database-specific adapter.


Update: the news is better than I thought. I tried this test:

from sqlalchemy import Table, Column, Integer, MetaData, String, Text

metadata = MetaData()

t2 = Table('t2', metadata,
      Column('id', Integer, primary_key=True),
      Column('t1', String(64000)),
      Column('t2', String(16000000)),
      Column('t3', String(4294000000)),
      Column('t4', Text)
     )

metadata.create_all(engine)

Then I checked to see what it ended up creating in the MySQL database:

mysql> show create table t2;

CREATE TABLE `t2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `t1` mediumtext,
  `t2` longtext,
  `t3` longtext,
  `t4` text,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

So it does map SQLAlchemy's generic String data type to a more-or-less appropriate MySQL data type.

It's not surprising to me that it used larger data types than we might expect. MEDIUMTEXT holds up to 16MB measured in bytes, not characters. Because my default character set is the multi-byte utf8mb4, the maximum length of a MEDIUMTEXT column is actually much less than 2^24 characters, so it had to be upgraded to LONGTEXT. Of course, 2^32 characters won't fit in LONGTEXT either, but it appears SQLAlchemy lets you create the column anyway.
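The byte-versus-character arithmetic above can be sketched directly (the 4-bytes-per-character worst case is specific to utf8mb4):

```python
# MySQL's TEXT-family capacities are defined in bytes. With utf8mb4,
# a character can take up to 4 bytes, so the number of characters a
# column is guaranteed to hold is the byte limit divided by 4.
limits_bytes = {
    "TINYTEXT": 2**8 - 1,
    "TEXT": 2**16 - 1,
    "MEDIUMTEXT": 2**24 - 1,
    "LONGTEXT": 2**32 - 1,
}

guaranteed_chars = {name: n // 4 for name, n in limits_bytes.items()}
for name, chars in guaranteed_chars.items():
    print(f"{name}: {limits_bytes[name]} bytes, >= {chars} characters")
```

This is why String(64000) landed in MEDIUMTEXT: 64000 characters can need up to 256000 bytes under utf8mb4, which is more than TEXT's 65535-byte limit, and likewise why 16 million characters can't be guaranteed to fit in MEDIUMTEXT.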

I still think it's hard to write totally implementation-neutral code. For example, what if you want to use MySQL features like table options for the storage engine, or data types with no generic equivalent, such as ENUM?

Upvotes: 33
