Reputation: 4008
I have three entities, Items, Categories, and Attributes.
An Item
can be in one or multiple Categories
, so there is N:M relation.
Item ItemCategories Categories
id name item_id category_id id name
1 alfa 1 1 1 chipset
1 2 2 interface
An Item
can have multiple Attributes
depending on the 'Categories' they are in.
For example, the items in Category
'chipset' can have as attributes: 'interface', 'memory' 'tech'.
These attributes have a set of predefined values that don't change often, but they can change.
For example: 'memory' can only be ddr2, ddr3, ddr4.
Attributes CategoryAttributes
id name values category_id attribute_id
1 memory {ddr2, ddr3, ddr4} 1 1
An Item
that is in the 'chipset' Category
has access to the Attribute
and can only have Null or the predefined value of the attribute.
I thought to use Enum
or Json
for Attribute values, but I have two other conditions:
ItemAttributes
item_id attribute_id value
1 1 {ddr2, ddr4}
1) If an Attribute appears in 2 Categories, and an Ithe is in both categories, only once an attribute can be shown.
2) I need to use the value with rank
, so if two corresponding attribute values appear for an item, the rank should be greater if it is only one, or the value doesn't exist.
3)Creating separate tables for Attributes is not an option, because the number is not fixed, and can be big.
So, I don't know exactly the best options in the database design are to constrain the values and use for order ranking.
Upvotes: 1
Views: 2461
Reputation: 10947
The problem you are describing is a typical open schema or vertical database, which is a classic use case for some kind of EAV
model.
EAV is a complex yet powerful paradigm that allows a potentially open schema while respecting the database normal forms and allows to have what you need: having a variable number of attributes depending on specific instances of the same entity.
That is what happens typically in e-commerce using relational database since different products have different attributes (i.e a lipstick has color, but maybe for a hard drive you dont care about color but about capacity) and it doesn't make sense to have one attribute table, because the number is not fixed and can be big, and for most rows, there would be a lot of NULL
values (that is the mathematical notion of a sparse matrix, that looks very ugly in a DB table)
You can take a look at Magento DB Model, a true reference in pure EAV at scale, or Wikipedia, but probably you can do that later, and for now, you just need the basics:
The basic idea is to store attributes, and their corresponding values as rows, instead of columns, in a single table.
In the simpler implementation the table has at least three columns: entity
(usually a foreign key to an entity, or entity type/category), attribute
(this can be a string, o a foreign key in more complex systems), and value
.
In my previous example, oversimplifying, we could have a table like this, that lists attribute names and its values for
Item table Attributes table
+------+--------------+ +-------------+-----------+-------------+
| id | name | | item_id | attribute | value |
+------+--------------+ +-------------+-----------+-------------+
| 1 | "hard drive" | | 2 | "color" | "red" |
+------+--------------+ +-------------+-----------+-------------+
| 2 | "lipstick" | | 2 | "price" | 10 |
+------+--------------+ +-------------+-----------+-------------+
| 1 | "capacity"| "1TB" |
+-------------+-----------+-------------+
| 1 | "price" | 200 |
+-------------+-----------+-------------+
So for every item, you can have a list of attributes.
Since your model is more complex, has a few more constraints, so we need to adapt this model.
With this, you end up with something like
Categories table List of categories ids and names
+------+--------------+
| id | name |
+------+--------------+
| 1 | "chipset" |
+------+--------------+
| 2 | "interface" |
+------+--------------+
Attributes table List of attribute ids and their name
+------+--------------+
| id | name |
+------+--------------+
| 1 | "interface" |
+------+--------------+
| 2 | "memory" |
+------+--------------+
| 3 | "tech" |
+------+--------------+
| 4 | "price" |
+------+--------------+
Category-Attribute table What category has what attributes. Note that one attribute (i.e 4) can belong to 2 categories
+--------------+--------------+
| attribute_id | category_id |
+--------------+--------------+
| 1 | 1 |
+--------------+--------------+
| 2 | 1 |
+--------------+--------------+
| 3 | 1 |
+--------------+--------------+
| 4 | 1 |
+--------------+--------------+
| 4 | 2 |
+--------------+--------------+
Value table List of possible values for every attribute
+----------+--------------+--------+
| value_id | attribute_id | value |
+-------------+-----------+--------+
| 1 | 2 | "ddr2" |
+----------+--------------+--------+
| 2 | 2 | "ddr3" |
+----------+--------------+--------+
| 3 | 2 | "ddr4" |
+----------+--------------+--------+
| 4 | 3 |"tech_1"|
+----------+--------------+--------+
| 5 | 3 |"tech_2"|
+----------+--------------+--------+
| 6 | ... | ... |
+----------+--------------+--------+
| 7 | ... | ... |
And finally, what you can imagine, the
Item-Attribute table will list one attribute value per row
+----------+--------------+-------+
| item_id | attribute_id | value |
+----------+-----------+----------+
| 1 | 2 | 1 |
+----------+--------------+-------+
| 1 | 2 | 3 |
+----------+--------------+-------+
Meaning that item 1, for attribute 2 (`memory`), has values 1 and 3 (`ddr2` and `ddr3`)
This will cover all your conditions:
SELECT * from Category-Attribute where category_id in (SELECT category_id from ItemCategories where item_id = ...)
will give you the list of eligible attributes, only one of each even if 2 categories had the same rank
, I think I dont have enough info for this query, but being this a fully normalized model, definitely, you can do a rank. You have here pretty much the full model, so surely you can figure out the query. This is very similar to the model that Magento uses. It is very powerful but of course, it can get hard to manage, but it is the best way if we want to keep the model strict and make sure that it will enforce the constraints and that will accept all the SQL functions.
For systems less strict, it is always an option to go for a NoSQL
database with much more flexible schemas.
Upvotes: 5