Reputation: 6029
We are developing a site at work for the xmas/new year period where individuals can buy/sell what they want its for charity basically, look at it as a smaller version of ebay.
We currently have 15 parent categories, and these parent categories have multiple sub categories which have many child categories.
We have designed the main item table which consists of Title, Description, CreatedBy, CreatedDate, Price etc
We are now moving on to the design of the item properties table for example let me explain about the vehicle category, the user can select a value for each of the following, Mileage, Make, Vehicle Type, No of Doors, Color, Fuel Type, Transmission.
We were planning on creating one main table which would consist of each property for each category so it would look something like this
PropertyId (BigInt) (PK)
ItemID (BigInt) (FK) < Links to the Main Item table
Mileage (int) (FK)
Make (int) (FK)
Type (int) (FK)
Doors (int) (FK)
Color (int) (FK)
Fuel (int) (FK)
Transmission (int) (FK)
Plus many more for each category
Each column will be of data type int and bigint, each FK will link to its respected table which will return the description that match the int value when queried, now I haven't gone any further I've taken a step back and thought this table is going to grow dramatically when we start working through all the other category properties, this table could potentially be well over 150 columns.
I will of course create the relationships between the tables which will help with performance but I'm unsure whether of not this is the right approach, when it comes to querying this table I will only be selecting the relevant fields i.e Vehicle category again I would only query the fields mentioned above.
But the thought of having a table with 150 + columns make my stomach turn (purely because I haven't worked with such size) maybe this is considered small to a DBA? I don't know but I thought I'll ask here and maybe get some reassurance and guidance in to what direction I should be going.
I did on the other hand find this
Which I have taken on bored what Rob Nicholson mentioned as well as Reed Copsey, but I wanted to get more of an input maybe from potential DBA's or maybe people that have crossed this sort of bridge and come up with a better approach.
Yours views and opinions will be highly appreciated.
Upvotes: 1
Views: 1886
Reputation: 6771
I strongly recommend you do not set it up this way. INT is a fixed length field - this means you are going to be using up a lot database space with a lot of empty values. Instead create a different Item Properties table for each category you have. You'll still have the same number of fields but they'll be spread out in different tables. It'll be MUCH easier to work with and more efficient to query.
If you really feel like you need a single Item Properties table, I'd make it abstract and forget about the foreign keys. You'll need a different way to enforce referential integrity, but it'll perform better in the long run. For example, instead of this:
CREATE TABLE ItemProperties
(
id BIGINT IDENTITY(1, 1) ,
itemId BIGINT ,
mileage INT , --auto
make INT , --auto
mpg INT , --auto
length INT , --boat
motorhp INT , --boat
motormake INT --boat
)
Do this:
CREATE TABLE ItemProperties
(
id BIGINT IDENTITY(1, 1) ,
itemId BIGINT ,
property1 INT ,
property2 INT ,
property3 INT
)
CREATE TABLE PropertyDefinition
(
id BIGINT IDENTITY(1, 1) ,
itemId INT ,
property1label VARCHAR(255) ,
property2label VARCHAR(255) ,
property3label VARCHAR(255)
)
INSERT INTO dbo.PropertyDefinition
( itemId ,
property1label ,
property2label ,
property3label
)
VALUES ( 1 , -- auto
'mileage' ,
'make' ,
'restofautoproperties'
),
( 2, --boat
,'length',
'motorhp',
'motormake'
)
Upvotes: 1
Reputation: 27474
Minor point: Why are "mileage" and "doors" keys to another table? Aren't these just integers? Are you going to have a "doors" table with doors_id and number, and with records (1,1), (2,2), (3,3), (4,4), etc? Why not just store the number and eliminate the table that maps number to themselves? If mileage is going to be the number of miles, this would require a table with tens of thousands of records, all mapping a number to itself. If the idea is that mileage is "low", "medium", or "high", than ok, different story.
First I would ask: If you are selling many different types of products, will you actually process all the detail facts about every possible product? Or will you just turn all this into text to be displayed in a description?
That is, if you are selling cars and clothes and books and cookies and toasters and a thousand other things, will the program ever process the number of doors? Will you ever produce reports broken out by number of doors? Will you provide a way for users to search for a car by number of doors?
If not, then it would be a lot simpler to just have a free form text field named "description" and let the people who populate the database type in whatever they want. Then you don't need to add fields to the database for every possible category of product, create separate data entry screens for every category of product, and a bunch of code to turn all these individual fields into formatted text. If that's the case, then throw away all the fields that are not processed, create one "description" field, and you're done.
If that's not the case, if you really will have a huge system with 100 product categories and separate search screens for each product category -- like if someone says they want to buy a car you take them to a screen where they can search by number of doors or transmission type; if someone says they want a book you take them to a screen where they search by author or genre, etc; if someone wants a toaster they go to a screen where they can search for number of slices and electronic versus electrical; if someone wants clothes there's a screen where they can search by size, material, color, etc; and so on for every different type of product you can think of ... Well, I'd say you are building a HUGE system. But okay.
There are basically three ways to do this:
Every product record has a type code. Some fields are relevant to some types and not others. Irrelevant fields are left null. Yes, 150 fields in one table is a lot. Most tables I create have maybe a dozen or so fields. There's certainly no maximum number, but if you pass 30 I'd start to question it. It gets very difficult to manage. I've created tables where some fields are not relevant to some record types, but there were a handful and reasonably obvious. As all your fields are short identifiers, I doubt this would really hurt performance: 150 integer points, probably 4 bytes each, 600 bytes, some overhead, varies with the database engine, but still you're talking hundreds of bytes, not an excessive burden on most DB engines. The bigger problem is for the programmers to keep track of it all.
Create a product record with the information common to all products, probably ID, type, description, price, maybe a few other things. Then create subtype tables with the detail for each type. So for every car there's a product record and a product-car record, for each toaster there's a product record and a product-toaster record, etc. The subtype records contain a pointer back to the product record, and when you display you have to find and read both.
Create a product table and a product-attribute table. The product-attribute table contains the name of an attribute, like "color", and the value, like "blue". Then you can have any set of attributes for any product. This sucks and I strongly recommend not doing it. It is impossible to manage. How do you control the name of an attribute? If users just type in the attribute name and value at data entry time, there are going to be inconsistencies all over the place, like one person puts "brand" and another puts "manufacturer", one puts "doors" and another puts "number of doors", etc etc.
Slightly better than 3, create an attribute table that lists the names of all attributes used anywhere with some sort of id. Create a producttype-attribute table that lists which attributes go with which product type. Create a product-attribute table that gives the values for the attributes for a given product. Like product (id, type, description, price), attribute(id, name), producttype-attribute(type,attribute_id), product-attribute(product_id, attribute_id). The producttype-attribute table is used in the code to determine which attributes are displayed on the screen for any given product type, at data entry time and also when displaying for the customers. This is way better than 3 because it keeps things consistent and controlled.
Upvotes: 1