Reputation: 22994
I have the following SQL design issue. The code below might look a little much but basically I have a table of cars and another table of attributes the car could have. This makes complete sense to me to structure a table of attributes for an object using a linking table, #CarSupportedAtttibutes. Recently I've been tasked with doing something similar but use one table that has each of the Attributes as columns making it "flat". Similar to below:
[CarId][Name][Manual Transmission][Sunroof][Automatic Transmission][AWD]
I am told doing so it will boost the speed of my queries, but its starting to turn into a nightmare. In C# I have enumerated values for each of the car's attributes, 1 = Manual Transmission, so using the non "flat" version I am able to pull off a query pretty quickly as the SQL code below shows. Since I am being pushed to making the table flat for speed the only way I can think of is to take the enumerated value and build it into the where clause, using a case statement for every 1,2,3 and selecting off a column name.
To me it just makes more sense to organize the data like below. Like what if a new attribute about a car is needed, say "HEMI Engine". Not all cars are going to have this, in fact its going to be a rare case. But The way I am told to design is to keep the table "flat", so now I would be adding a Column called "Hemi Engine" to my table, instead of adding a row in my CarAttributes, and then only adding rows for the cars that have that as true.
Below is a snippet of the way I currently see approaching this problem, as opposed to doing a "flat" table (table with mostly columns of bits).
Question: What design makes more sense? Which is more maintainable? Am I completely crazy for thinking below is a better approach, and why?
CREATE TABLE #Car
(
CarId INT,
Name VARCHAR(250)
)
INSERT INTO #Car VALUES (1, 'Fusion')
INSERT INTO #Car VALUES (2, 'Focus')
CREATE TABLE #CarAttributes
(
AttributeId INT,
Name VARCHAR(250)
)
INSERT INTO #CarAttributes VALUES (1, 'Manual Transmission')
INSERT INTO #CarAttributes VALUES (2, 'SunRoof')
SELECT * FROM #CarAttributes
CREATE TABLE #CarSupportedAttributes
(
AttributeId INT,
CarId INT
)
INSERT INTO #CarSupportedAttributes VALUES (1,2)
--Determine if A Focus has a manual transmission
SELECT * FROM #Car c
INNER JOIN #CarSupportedAttributes csa
ON csa.CarId = c.CarId
INNER JOIN #CarAttributes ca
ON ca.AttributeId = csa.AttributeId
WHERE c.Name = 'Focus'
AND ca.AttributeId = 1
Upvotes: 1
Views: 158
Reputation: 185643
Your approach is known as Entity-Attribute-Value, or EAV (yours is slightly modified, since in your model the presence of the attribute on the entity is the value, but the concept is the same).
EAV is usually considered an anti-pattern, but it can be appropriate in some cases. Basically, if either...
Then EAV can be an appropriate choice. I can't answer either of those questions for you (though I have my suspicions), but it does seem like it might be appropriate in your case.
The other option, which is likely what most 6NF proponents would suggest, would be to have a table per attribute, like CarSunroof
or CarManualTransmission
. This would solve the first issue and the requirement of changing a table's definition whenever a new attribute is added, but would not address the issue of the user being able to change it.
Upvotes: 3