user1322720
user1322720

Reputation:

Why are there different TEXT types?

As far as I understand, the only difference between the different TEXT column types in MySQL is the number of bytes necessary to store the length:

TINYTEXT    L + 1 bytes, where L < 2^8  
TEXT        L + 2 bytes, where L < 2^16
MEDIUMTEXT  L + 3 bytes, where L < 2^24
LONGTEXT    L + 4 bytes, where L < 2^32

So why isn't there just one column type with the maximum length and no necessity to store the length?

TEXT        L bytes, where L < 2^32

Upvotes: 1

Views: 79

Answers (2)

axiac
axiac

Reputation: 72425

There are historical reasons for this. MySQL was designed 20 years ago, using the ideas and the hardware and software limitations of that time. 32-bit hardware and Linux were something relatively new back in 1995. The usual amount of RAM installed in a desktop computer was somewhere around 4-8 MB. The servers had 16 or 32 MB.

The database design ideas it implemented were another 20 years old (or more), when the hardware and software limitations were even stronger (the computer programs used to run in several hundred kilobytes).

Back then, there were two different types to store strings in databases: CHAR and VARCHAR. They still exist now but the differences between them blur every day. The values of CHAR columns used to have a fixed length while VARCHAR used to have variable length, not exceeding a maximum length declared on column creation. (There are other differences too, regarding the handling of whitespace characters but they are not relevant for the discussion).

There are several reasons why fixed-length and variable-length string types were created. While variable-length seems better because it doesn't waste space, the data of fixed-length string columns can be stored in the same place as the data of other fixed-length data types (integers, dates etc). Having the same length for all the records of a table is very important at the low level because it provides constant time for finding and retrieving a row in the storage, given the row number (provided by an index). Variable-length strings cannot provide this functionality, that's why they were stored in a different place, using a different way to retrieve them. A different place means an additional disk read in order to get them into memory and this takes time. Both fixed-length and variable-length string types have advantages and disadvantages.

Regarding the sizes, back in the 80s (and before that) the RDBMSes used to implement only CHAR/VARCHAR (max. 255 bytes) and TEXT. For most database systems at that time, the maximum length for a TEXT field wasn't larger than 64KiB.

MySQL implemented the data types in use by the existing database systems in 1995 and extended them.

Due to the hardware improvements and the reduction of storage cost per GB in the last 20-25 years, the technical reasons that generated the existence of CHAR and VARCHAR string types and their various size flavours became less and less important. As a consequence, MySQL unified their internal handling and storage, as you remarked in their documentation.

Upvotes: 1

Zalomon
Zalomon

Reputation: 553

For the same reason there are different number sizes (tinyint, smallint, etc.): the bigger each record is the more space it occupies, which consume hard drive space, makes searchs slower, etc., and when number when the number of records scalate this issues do as well. As a general rule of thumb is wise to always use the smallest type possible.

Upvotes: 0

Related Questions