Przemek
Przemek

Reputation: 39

Products and variants - best way to design database

Description

Shop can have products (Shoes, T-Shirts, etc). Each product can have many variants where each can have different price and stock, e.g. T-Shirt has different colors and sizes

My solution for this problem is depicted in the picture

http://dl.dropbox.com/u/43925242/erd_product_variant.png

But this solution has some disadvantages

What can you suggest to simplify design of database for this problem?

===== UPDATE ======

And what about NoSQL solution? Any proposals?

Upvotes: 3

Views: 9843

Answers (3)

Neville Kuyt
Neville Kuyt

Reputation: 29629

The basic question is "how do I store heterogeneous subtypes of a parent class?". There are a number of options - the one you've chosen is a bit of a hybrid, which isn't necessarily a bad thing.

The best description I've seen of this topic is in Craig Larman's book "Applying UML and patterns" - though he writes from an object oriented, rather than database point of view.

First things first: the way you've set up "variants" may not be what you want - it suggests that "price" and "stock" move together, whereas they are very separate bits of data. I'd consider factoring them out into their own tables - "variant_price" and "variant_stock".

Secondly, the option you've chosen to represent features is commonly known as "Entity Attribute Value" or EAV. It's got the major benefit of allowing you to store data without knowing its schema at design time, but it makes any kind of boolean queries into a huge pain - imagine looking for all red tshirts in size XL.

There are 3 alternatives in the relational world (this is based on the Larman book):

  • subtype per variant. So, you create a "variant_tshirt" table with size, colour etc, and a "variant_trouser" with size, colour, inside leg etc. This keeps the tables nice and self-describing, but makes your SQL into a huge mess - it has to change for every subtype.

  • single table with all possible columns: in this case, you have a single table with all the possible fields for all subtypes. This way, your SQL stays far simpler - but the table becomes a huge mess, and you depend on your client application to "know" that trousers have an inside leg attribute, and t-shirts don't.

  • table for common attributes with subtypes storing their unique values in their own tables. In this model, assuming you've only got trousers and t=shirts, you have a "variants" table with size and colour, and a "trousers" table with inside leg.

Each option has benefits and drawbacks - especially in a situation where you don't know in advance which subtypes you're going to need, the first option is the simplest on the database end, but creates a bit of a mess for the client code.

Outside SQL, you can go for XML - using XPath, you can easily execute boolean queries, or NoSQL - but NoSQL would not be my favourite here, most of them are conceptually based on key-value relationships, which make the boolean queries rather hard.

Upvotes: 5

Purplegoldfish
Purplegoldfish

Reputation: 5284

I would suggest having a few tables, Category ie Tshirt, Jeans etc etc, Product ie Tshirt A, Tshirt B, Then a Variant table with detail such as ProductID (links to Product table), Colour, Size, Price, inStockQty.

I would also have a ProductsInCategories table with just CategoryID, ProductID so you can map a product to appear in multiple categories at once (think Unisex Tshirts that you may want to display as both mens Tshirts and Womens Tshirts)

You would have a Many to Many relationship between Category & Product using the ProductsInCategories table and a One to Many relationship between Product and Variant ( One product can have many variants but a variant can only belong to one product)

The drawbacks you mentioned can be overcome with well designed stored procedures and tables, indexing will also help performace. Also controling each variant such as updating prices etc should be fairly easy with this table structure.

Upvotes: 0

Arvo
Arvo

Reputation: 10570

Rename your Products to Categories and Variants to Products; remove Stock (and Price, unless most your variants have same base price) from Categories. This way you have multiple products, related to same category. Currently there exists ambigiuity between Products and Variants.

Product features you need to check at application level, not in database. You could link FeaturesList table (containing FeatureName, Required/Optional and CategoryID) to Category to simplify product features validation and search template generation. Or if most your categories have similar feature set, you can create NN relationship to features list; such structure is a bit harder to maintain, but much more flexible.

BTW, 'hard to control' is non-argument. If there are strict rules to validate data, then you need follow them; if there are not, then it is not your task to implement them either. Implementing huge list of strict rules is easier than attempt to invent, what two or three rules may customer want :)

Upvotes: 0

Related Questions