Reputation: 3432
I need to create column in mysql 5.1 that can store user's feedback. It shouldn't be too long, so I think not more 1000 characters of UTF-8.
The question is how to represent this efficiently in mysql 5.1.
For now I have:
`description` varchar NOT NULL,
But how to constrain varchar
to hold at most 1000 characters of UTF-8?
Upvotes: 1
Views: 198
Reputation: 116110
Since the size is apparently defined in bytes, ...
-correction- Field size is defined in 'character units'. It's a bit unclear what they mean by that, but I guess they mean 'code units'.
Removed the rest of the detailed explanation, since it wasn't (entirely true).
Correction. In MySQL you actually define the number of characters in the field. It is still limited to the 65535 byte boundary though. Above that, MySQL just reserves 3 bytes per character for UTF-8, which means that you cannot have UTF-8 fields of more than 21844 characters, and declaring a field als VARCHAR(21900) will just fail for that reason: " Column length too big for column 'field1' (max = 21845); use BLOB or TEXT instead: "
. The number in this message is wrong, by the way. The actual maximum size is 21844. 21845 is 1/3 of 65535, but I guess you need to subtract the two bytes for the field size header as well.
The limit of 3 bytes is weird, though. The unicode definition is designed to be able to expand with extra characters. There are already supplementary characters of 4 bytes, that actually cannot be stored in a UTF-8 varchar(1) field, or any varchar field for that matter, since MySQL just doesn't seem able to read those characters: "Incorrect string value: '\xF0\xA0\x9C\x8E' for column 'field1' at row 1"
. So I guess you would need an actual binary/blob column to be able to store these characters.
I think the documentation about this subject is pretty poor, but I've tried some things and came to this conclusion. You can see the fiddle here: http://sqlfiddle.com/#!2/4d938
To the question:
So for your specific situation, declaring the field as varchar(1000)
will do the trick, presuming you don't want people to use the supplementary characters in their feedback.
Some things to consider though:
Upvotes: 2
Reputation: 52883
From the documentation:
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions. The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
This means that you can store up to 65,535 bytes in a VARCHAR column. However, from the String Type Overview:
MySQL interprets length specifications in character column definitions in character units. (Before MySQL 4.1, column lengths were interpreted in bytes.) This applies to CHAR, VARCHAR, and the TEXT types.
So, declare your table with a UTF8 collation and set the length of the varchar to 1,000 characters and MySQL will do the work for you behind the scenes.
Upvotes: 2
Reputation: 2262
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions.
http://dev.mysql.com/doc/refman/5.0/en/char.html
Upvotes: 0