Reputation: 3955
A users
table has a need to record the status of each user. For example:
There are at least two options here. One is to simply have a tuple (or some other data structure) with all available status variants defined in the class module. Something like this:
user_status = ("NO_ACCOUNT",
"ACCOUNT_CREATED",
"NOT_VERIFIED",
"VERIFICATION_FAILED_SMS",
"VERIFIED")
The other is to have a database table with a one-to-many relationship to the users
table:
id status
0 "NO_ACCOUNT"
1 "ACCOUNT_CREATED"
2 "NOT_VERIFIED"
3 "VERIFICATION_FAILED_SMS"
4 "VERIFIED"
Now, what's interesting here is that in both cases the users
table's status
field will store an integer corresponding to the predefined status encoded either into the data structure or database table. In either solution the users
table would be indistinguishable from the other.
The database version would ensure the status value exists within the lookup table. The code version could, in theory, assign any value, however, in practical terms, each status evaluation and change to the users
table would happen through a class method that, barring coding mistakes, would make it very difficult to make mistakes.
For the most part I see both solutions as equivalent.
However, when it comes to writing code to deal with various status cases things get interesting. Let's say I want to take an action based on the user's status. Assuming the value has been retrieved and assigned to self.user_status
somewhere in the class, I might do this:
if self.user_status == 0:
print "NO_ACCOUNT: do something relevant"
elif self.user_status == 1:
print "ACCOUNT_CREATED: do something relevant"
...
etc
In other words, we have undesirable magic numbers in the code. Bad idea.
What I really want is something like this:
if self.user_status == NO_ACCOUNT:
print "NO_ACCOUNT: do something relevant"
elif self.user_status == ACCOUNT_CREATED:
print "ACCOUNT_CREATED: do something relevant"
...
etc
Which would mean defining a set of variables to capture the values (in C these would be #define
but no such thing in Python):
NO_ACCOUNT = 0
ACCOUNT_CREATED = 1
NOT_VERIFIED = 2
VERIFICATION_FAILED_SMS = 3
VERIFIED = 4
Now we have a clean looking set of conditionals but we are still dealing with having to maintain these variables-as-lookup-table solution every time something changes. Perhaps we add a new status code or remove an old one.
The situation while using a lookup table isn't much different from this. If we don't define the above variables we end-up with conditionals filled with magic numbers and lack any kind of meaning:
# This means nothing without a comment
# Prone to errors
if self.user_status == 0:
A dictionary would probably be a better solution rather than creating a gaggle of variables. The conditional would still have meaning:
if self.user_status == status_codes["NO_ACCOUNT"]:
And, finally, coming full circle, this dictionary could be loaded during class initialization by reading the lookup table off the database in order to establish the key-value pairs. In this case there's potential for the code to become "brittle" if the values in the lookup table are modified. If, for example, I decided to change "NO_ACCOUNT" to "ACCOUNT_MISSING" the code would break and we'd have to refactor to fix.
I understand normalization and using the DBMS to enforce integrity, etc. The part I am pondering here is where the line might be in deciding where to store lookup tables of this kind: In code? In the db? Or both and double the maintenance?
I am not looking for opinion but rather an algorithm of sorts to use in making these kinds of choices. Perhaps there a really strong answer to this question based on DB theory. All the better.
Upvotes: 0
Views: 290
Reputation: 26609
Use an enum type.
CREATE TYPE account_verification_status AS ENUM ('NO_ACCOUNT', 'ACCOUNT_CREATED', 'NOT_VERIFIED', 'VERIFICATION_FAILED_SMS', 'VERIFIED')
Upvotes: 2