Rickaroo
Rickaroo

Reputation: 97

MySql - Why does ENUM take more space than expected?

I did a test to make sure I know how ENUM works and how it handles storage space... and got different results than expected.

One table with one field of type VARCHAR(100) populated with 1,000,000 rows. Each row has a value randomly selected from 1 of 6 strings, length of 100.

Then, conversion to ENUM, then back to VARCHAR(100). Here are the results (Data size).

1. Rows 1,000,000 = 99.2 MiB, VARCHAR(100)

2. Rows 1,000,000 = 6,835.9 KiB, Enum ('blah100Characters1','blah100Characters2',...,'blah100Characters6')

3. Rows 1,000,000 99.2 MiB, VARCHAR(100)

The VARCHAR(100) type reported as expected and matches the MySQL specification in the manual of ("L + 1 bytes, 0 <= L <= 255") 1,000,000 x 100 = 100,000,000 = 99.2 MiB

---Edit: Well, plus that one extra byte, but that's irrelevant to this discussion :o)

however, according to the MySQL specification of ENUM ("1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum)"), with 6 possible combinations I'd expect there to be a data requirement of 1 byte per row. 1,000,000 x 1 = 1,000,000 = 976.5 KiB

Can anyone explain to me why the converted table requires 6,835.9 KiB, which is, oddly enough, almost exactly 7 times more than expected?

Upvotes: 1

Views: 677

Answers (1)

Matthew
Matthew

Reputation: 48284

It adds up to 7 bytes (I get the same results); some of the space is padding and some is for a delete flag.

To prove that there is padding, add additional enums (or tiny integers). The size of the table does not change.

To prove that there is a delete flag, delete a row in the middle. The size of the table does not change.

According to this page, it has to do with myisam_data_pointer_size having a default value of 6 bytes (plus 1 byte for delete).

And he seems to be correct, as if I do this:

alter table foo MAX_ROWS=10;

the size of the table decreases.

Furthermore, from this "bug" report, it sounds like a deleted record is stored as a pointer to the next record. If so, that would mean the minimum space for any row would be the pointer size (6 bytes, by default) plus a delete byte. This is because if a record is deleted, the delete byte is set, and then the other 6 bytes are used to point to the next record.

If you want more info, I'd read up on the "delete link chain" of MyISAM tables (when using fixed record counts).

Upvotes: 1

Related Questions