Reputation: 4594
So I'm not really a database developer by trade, it's just part of my job function. But I've noticed I've been favoring a particular database design scheme lately and I have this gut feeling it's frowned upon. Suppose the following:
I have a Customer
table
Id | Name | etc...
----------------------
1 | "Bob" | etc...
Business has recently been requiring that we implement these seemingly short-lived and unrelated-to-anything-other-than-customer pieces of data; thing's like "IsPartOfGroupX"
or "FriendCount"
. So to accommodate this I've implemented a table as follows (namely Customer_Attributes
):
Id | CustomerId | Name | Value | IsActive
-------------------------------------------------------------
1 | 1 | "IsPartOfGroupX" | "True" | 1
2 | 1 | "FriendCount" | "42" | 0
My two questions are:
Is this a known/practiced relational database design pattern?
It seems like it would be, because it's been super useful. The former strategy for this type of this was just adding additional columns to the Customer
table and leaving it there when we don't use it.
Is this a good or bad practice?
I feel like true DBA's would have something to say in this space. Although it works and seems novel to me, it feels kind of wrong. Essentially I'm just flipping a table on its side and turning all of the columns into rows. But I don't know! That's why I'm asking you.
Thanks in advance for any insight you may be able to provide.
Upvotes: 1
Views: 149
Reputation: 18940
Your practice is, as others have pointed out, a case of EAV. EAV is usually, but not always, a mistake. This could be one of the exceptions.
Here's what to watch out for. These "one-off attributes" are, for all intents and purposes, unmanaged by your DBMS, because they aren't defined as columns. That means that composing queries that use this data is approximately as hard as doing the same thing with data that is not in a database.
If you, or your management, expect that turning this data into useful information is going to be easy, the way it's supposed to be in a database, this is where it's going to come back to bite you.
If, however, you rarely use one-off data in standard ways, you may just get away with it.
Upvotes: 1
Reputation: 5636
This is a Bad Idea, as has already been mentioned. But the genesis of the problem comes with thinking all information in a database must be modeled. Data is modeled, not information. Information is derived from the data.
Take IsPartOfGroupX
, for instance. What is being asked is "Is this customer a member of that group?" This is a perfectly valid question. I don't see why you label this a "seemingly short-lived and unrelated-to-anything-other-than-customer" piece of data. You have customers and you have groups in your database. (If you don't have groups in your database, then the question becomes nonsensical rather than anything you have described.)
So I assume you have groups. In that case, this is a question that can be answered through a view or UDF or a combination of both. Same for the number of friends question.
The purpose of a database is to provide, directly or indirectly, answers about the data. The view and/or UDF would be a direct answer; allowing the app to perform an analysis would be an indirect answer.
You don't (can't actually) model every question asked of a database. You can provide, through views, stored procedures and functions, easy methods to get to the answers.
Upvotes: 0
Reputation: 974
about your questions.
1) Yes, but is not IMHO an advisable except for small proyects or databases in relation of the amount of changes that you will do in futuro no about the mount of data and columns in the table, for many reasons that mentions in the links of the first coment in your post.
In my personal experience, I can say that since I believe such a table, you will struggle to have defraud the 1,2,3 and many possibilities have a need to switch to the table or adjust your values so Suits for all situations, which is almost impossible to say or at best will Highly tedious.
And what about placing restrictions forget it not Can Be Done. and that takes away some robustness to your BD.
I remember that for call the table EAV values, i had to create a lost of constants the represent primary key codes, something like hardcore. image that you need to read a query with this generic table one, two or more times, you cant deduct what the f*** is doing that query until you check values in the 'generic EAV' table.
I can definitely say about your second question It depends but IN GENERAL ISN'T RECOMENDED, in your case, you only need to add two more columns , no need to use that as bad pattern.
I always try to remember to the table as an entity and its columns are attributes , the most logical way possible Regarding doing OOP , sometimes more tedious re- insert tables make , and even if you have inserts that does not include the specific fields of tables worse , but in the end , will always be a cleaner way of doing things I repeat as IMHO .
Don't forget check links in the first comment of the post
Upvotes: 0
Reputation: 524
You're not turning columns into tables
but you are manageing properties of the customer in a normalized
database scheme. Read more on this topic, since this is the essence of 'Codd and Date's book 'on database design'.
In your example however i would create three columns per property to store 'boolean', numerical and text values, as to make use of the power of the database to limit values to expected values.
Id | CustomerId | Name | BValue | Nvalue | Tvalue | IsActive
------------------------------------------------------------------------
1 | 1 | "IsPartOfGroupX" | True | null | null | 1
2 | 1 | "FriendCount" | null | 42 | null | 0
3 | 1 | "Comment" | null | null | "new" | 1
So you are looking in the right direction.
However if the data has to be 'always filled' or always used in combination with the other table you better move it to a column of the customer database.
Upvotes: 0
Reputation: 225
Adding another table is ok and I think calling it attributes is ok, However you should do 3 columns so the customer I'd only needs to appear once. Like this: ID MEMBER OF GROUP X. NUMBER OF FRIENDS. I assume column 2 would be boolean and 3 would be an integer. The way you have it is a recipe for complexity query problems.
Upvotes: 0