Reputation: 608
In my application, I am storing ids in a table as char(16), calculated as hex(uuid_short()) to make it usable with memcached plugin which requires 'key' to be a char or varchar. Example value: 57F328CF000003
If I leave it to default characterset which is utf8, as per docs it will use 3x16 bytes becaues utf8 can have max 3 bytes. However for the possible values in my use case (which is 1-9 digits and A-F), 1 byte ascii character set is sufficient.
I am unsure about whether it is a good idea to change just the column or just the table to use ascii char set? Are there any performance or design implications to using a different charset to the default of schema or the table? Are there any implications to the collation at all? Currently I use default char set 'utf8' and default collation 'utf8_general_ci'.
Upvotes: 0
Views: 439
Reputation: 142528
It is certainly possible to have different CHARACTER SETs
(and/or COLLATIONs
) in different columns in the same table.
The charset for the table is just a default; it plays no other role.
In the case of hex, ip addresses, post codes, etc, it is strongly advisable to use CHARACTER SET ascii
(latin1
is almost as good).
CHAR(16)
means there are 16 characters, and it is fixed length, so the length is 16 * the longest possible character. That is 48 bytes for utf8. 32 bytes are wasted.
VARCHAR(16)
would have a 1-byte length, plus the necessary bytes for up to 16 characters, so 17 in the case of 16 hex characters.
Using ascii is a performance benefit because of making the table smaller. Do it.
UUIDs
(and MD5s, etc) have a different problem when you have millions of rows -- they are very random, thereby leading to jumping around a lot in the table. If the table is too big to be cached in RAM, performance can become terrible.
When JOINing
tables on, say, your uuid, the declaration of the uuid in both tables must have the same charset and collation.
Upvotes: 1