Reputation: 3417
I have a need to store into a MySQL table lengthy bit-strings which could be as long as 32768 bits. This data need will not need to be indexed or full-text searched at any time. If I have read correctly, this size should be well within both my max_packet_size as well as the row-size limit @ 65k.
Ideally I would like to store the strings (and INSERT them) in 0b format, but this is not a requirement...anything that will give me essentially 1:1 data/size on disk would be great.
BLOBs do not seem to do the job well enough, as a string comprised of only ones and zeroes ('010101010101') is seen no different than normal text and costs me L bytes + 2. BIT() would be perfect, but is limited only to 64 bits max length.
Though much of the data (90%+) would be sufficiently represented within an unsigned Bigint, the remaining 10% of rows entice me to find a more elegant solution than splitting them up logically (i.e., searching a secondary table if not found in the first, secondary table using BLOBs for remaining 10% rows, etc.).
An added bonus would be any type that permits bitwise operations, but if not, this is just as easily done outside the MySQL server.
What is the most efficient data type for this purpose?
Upvotes: 3
Views: 1110
Reputation: 30301
I guess the BLOB type is what you need. It can represent binary strings up to 2^16 bytes and has an overhead of 2 bytes per record (if L is the length in bytes of the value, L+2 bytes is its size on disk).
Then, if you really want to optimize, use two tables, one with BLOB and the other with TINYBLOB (strings up to 2^8 bytes, 1 byte overhead), then UNION them together in a VIEW or during SELECT.
If you want to optimize even more, use a third table with BIGINT (this will allow storing binary strings up to 58 bits, since the remaining 6 will be needed to store the length of the binary string).
Upvotes: 1
Reputation: 73226
I would say it mainly depends on your access pattern. If you can afford to read/write the whole bitstring at each access, then a varbinary(4096) will work fine and be quite compact (only 2 bytes of overhead for the whole field). In this model, one bit on application side is really represented by one bit in the data storage, and it is up to the client application to interpret it as a bitstring (performing bitwise operations, etc ...)
If you want to optimize a bit more, you can imagine a table with a bigint and a varbinary(4096):
create table dummy (
dummykey int not null,
bit1 bigint null,
bit2 varbinary(4096) null,
primary key(dummykey)
);
Only one of the two fields is not null for a given record. If bit1 is not null, then it can store 64 bits. For larger bitstrings, bit1 is null, and bit2 is used instead. The client application has to be smart enough to handle all bitwise operations (paying special attention to signed/unsigned issues with bit1).
Upvotes: 2