Reputation: 2561
This is a design question involving both Java and MySQL.
The client requires the addition of 14 boolean flags (T/F) to keep track of some new information in an existing class/table.
I can add these flags to the existing table, or I could create a new class and table just for this data. Adding the 14 boolean flags to the existing table will give it quite a few attributes, which I'm inclined to avoid (especially if the number of flags increases in time). Creating a new class/table is cleaner, but it it really necessary in this case?
Alternately, I could use a 16 bit integer with masks to multiplex the data and then I'm only adding one variable to the existing class/table.
My primary question is this: is it more efficient to store 14 individual boolean variables in a MySQL database and load them into the class, or would it be better to store a single integer and then (in Java) multiplex the flags using bit manipulation (i.e. masks)?
Secondary question, if individual flags are more efficient, then is it better to have lots of attributes in one table or split them? What is the penalty for storing lots of boolean flags in a table that already has quite a few entities?
If the primary question's answer is "integer + multiplex" then the second question becomes moot.
Thanks.
-R
Upvotes: 1
Views: 1361
Reputation:
I personally like to have separate columns. the only place I might consider masking is when the database and the application are running under extreme conditions or on low memory and storage devices where any use of memory or space is crucial.
1- space should not be a consideration unless the class/table can grow to huge volumes. to simulate Boolean flags a tiny int (1) is enough and all you need is 0/1 values.
2- it becomes much harder for anyone wanting to do queries on the table or wanting to write reports using it. and if your client does access the database, I am quite sure masking won't be acceptable in most cases.
3- it will be much harder to build indexes on this column when they are needed, if that will be possible at all (based on the database)
4- working more and writing more code should not be an issue. You work more now but you will work less in the future. thinking it is less work for the programmer/dba is just an illusion IMHO. here are some considerations:
a- it will be harder to maintain the code and write database queries. maybe you do everything in your java code now but you never know what the future holds.
b- making structural changes become harder. what if the customer requires removal of two flags and addition of 4 ? do you keep the original two bits that held the removed flags in the database and add 4 bits ? or you use them for two of the new flags and then add two more bits? how would this affect code that is already written ? and how easy would it be to track all places and actually making the changes in the code?
in a small application, this is not a big problem. but applications grow with time. if the table gets to be widely used, this is very dangerous. if you had code working with the 7th and 8th flag, and they were removed and the decision was (by some other programmer lets say) to reuse the same places, any code that used to access the 7th and 8th bit will keep functioning (incorrectly) until that is noticed. it could already do harmful things until the issue is spotted and fixed. if you had separate columns and you dropped them, the error will pop up to the surface on the very first use of that code as the columns won't be there.
c- it will without a doubt be harder to make scripts that upgrade the data and/or change structure for the dba. an experienced dba will not sit and write the column names one after the other and will use its tools to generate scripts. with bit manipulation, he will have to work by hand and make no mistake in the expressions he produces in various selects/updates
5- all the above is database related. once it reaches your application, you are free. you can read the 16 flags from the database and produce your integer and from now on, your code can use bit manipulation on it and you can save time (by writing your functions that deal with it once and using them). I personally think that here too its better not to do so but anyway its your choice.
I know i am not focused and that i might have repeated here and there. But I also hope that i was able to help you in seeing longer term considerations that will help you make the right choice for your case.
Upvotes: 3
Reputation: 30865
In the primary question you ask that what is more efficient then what is better. This complicate the answer.
From point of view of Developer and DBA having a single column is more efficient solution. Because you spare place and using masks you increase the performance of inserts and updates.
From point of view data analyst the separate column is more efficient solution, each column has specified role.
As goes fro me i prefer the masks - Les changes in code - Better management (limited integer capacity is a risk here)
Upvotes: 0