KillerFrca
KillerFrca

Reputation: 135

Using masks in MySQL table

I have a problem with designing table structure, and i would like to hear your input. I'm creating a table that should contain around 1-2mil rows, lets call them users, and for each users i need to store around 500 boolean values. Main emphasis with this table is to be fast to read from, there is no emphasis on size or write time. Main queries will be about selecting number of users matching various boolean columns values as true or false.

So here is my question: I have to designs in my mind.

Have anyone experience around working with such data, and speed comparision between similar methods? Or has anyone better idea how to approach this?

Upvotes: 1

Views: 981

Answers (3)

ericpap
ericpap

Reputation: 2937

Please consider a third option. Its a Little more complicated but can improve performace. You will need 3 tables:

  • Users
  • Permissions
  • PermissionsxUser

Table Users only have the information about users. Your "columns" are translated in rows of Permissions table. Then for each permisions you want to give add a row in PermissionsxUser Table. Of course each table need an ID.

It is also more efficient because you only add records when you grant the specific permission. If Users don't have a record for specific permission ID it is not allowed to do that action.

Upvotes: 2

Andrzej Reduta
Andrzej Reduta

Reputation: 767

You can create SQL structure that will emulate this 500 bollean values as for example 8 * (64 bits) BIGINT columns.

Each boolean set can be representet as integer. So You can group Your booleans as inegers.

If You want ask Your DB for boolean set You have to translate it to integer first.

So in every query You will ask for 8 bigints sets, not for 500 booleans.

It may help. Each column should have an index of course.

Example: value 2 for unsigned Integer (4 bits) will be represented as 0010 It may stands for 4 boolean columns which have values, false,false,true,false.

Upvotes: -1

Gordon Linoff
Gordon Linoff

Reputation: 1270081

You should test the two different approaches on sample data on your system. A "normal" way to store the data in SQL would allocate 1 byte per boolean value. So, a given record would be about 500 bytes (probably plus some additional overhead). By bit fiddling, you can put this into 63 bytes or so, with the additional complication of more complicated access. MySQL can readily handle 500 columns.

Under normal circumstances, the difference would not be so great. Reducing the size of the records by a factor of 8 reduces the number of pages needed and can be the difference between data fitting in memory or not fitting in memory -- and that can be a significant performance boost. (Actually, a few million records of 500 bytes each should readily fit into available cache, so this might not be a big difference after all).

On the other hand, you can't really index bits. So, if you want to improve performance for certain queries using indexes, that option would be more difficult if you pack the bits.

In general, for a handful of flags, the performance gains by backing the bits are probably not worth the added complexity on the access. However, you have a rather extreme case where bit-packing may be warranted.

Upvotes: 1

Related Questions