Herr
Herr

Reputation: 2735

Approach for multiple "item sets" in Database Design

I have a database design where i store image filenames in a table called resource_file.

CREATE TABLE `resource_file` (
  `resource_file_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `resource_id` int(11) NOT NULL,
  `filename` varchar(200) NOT NULL,
  `extension` varchar(5) NOT NULL DEFAULT '',
  `display_order` tinyint(4) NOT NULL,
  `title` varchar(255) NOT NULL,
  `description` text NOT NULL,
  `canonical_name` varchar(200) NOT NULL,
  PRIMARY KEY (`resource_file_id`)
) ENGINE=InnoDB AUTO_INCREMENT=592 DEFAULT CHARSET=utf8;

These "files" are gathered under another table called resource (which is something like an album):

CREATE TABLE `resource` (
  `resource_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `description` text NOT NULL,
  PRIMARY KEY (`resource_id`)
) ENGINE=InnoDB AUTO_INCREMENT=285 DEFAULT CHARSET=utf8;

The logic behind this design comes handy if i want to assign a certain type of "resource" (album) to a certain type of "item" (product, user, project & etc) for example:

CREATE TABLE `resource_relation` (
  `resource_relation_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `module_code` varchar(32) NOT NULL DEFAULT '',
  `resource_id` int(11) NOT NULL,
  `data_id` int(11) NOT NULL,
  PRIMARY KEY (`resource_relation_id`)
) ENGINE=InnoDB AUTO_INCREMENT=328 DEFAULT CHARSET=utf8;

This table holds the relationship of a resource to a certain type of item like:

I do exactly this by giving the "module_code" a value like, "product" or "user" and assigning the data_id to the corresponding unique_id, in this case, product_id or user_id.
So at the end of the day, if i want to query the resources assigned to a product with the id of 123 i query the resource_relation table: (very simplified pseudo query)

SELECT * FROM resource_relation WHERE data_id = 123 AND module_code = 'product'

And this gives me the resource's for which i can find the corresponding images.

Thank you

Upvotes: 0

Views: 247

Answers (3)

APC
APC

Reputation: 146209

To answer your second question first: the table resource_relation is an implementation of an Entity-attribute-value model.

So the answer to the next question is, it depends. According to relational database theory it is bad design, because we cannot enforce a foreign key relationship between data_id and say product_id, user_id, etc. It also obfuscates the data model, and it can be harder to undertake impact analysis.

On the other hand, lots of people find, as you do, that EAV is a practical solution to a particular problem, with one table instead of several. Although, if we're talking practicality, EAV doesn't scale well (at least in relational products, there are NoSQL products which do things differently).

From which it follows, the answer to your first question, is it the correct approach?, is "Strictly, no". But does it matter? Perhaps not.


" I can't see a problem why this would "not" scale. Would you mind explaining it a little bit further? "

There are two general problems with EAV.

The first is that small result sets (say DATE_ID=USER_ID) and big result sets (say DATE_ID=PRODUCT_ID) use the same query, which can lead to sub-optimal execution plans.

The second is that adding more attributes to the entity means the query needs to return more rows, whereas a relational solution would return the same number of rows, with more columns. This is the major scaling cost. It also means we end up writing horrible queries like this one.

Now, in your specific case perhaps neither of these concerns are relevant. I'm just explaining the reasons why EAV can cause problems.

"How would i be supposed to assign "resources" to for example, my product table, "the normal way"?"

The more common approach is to have a different intersection table (AKA junction table) for each relationship e.g.USER_RESOURCES, PRODUCT_RESOURCES, etc. Each table would consist of a composite primary key, e.g. (USER_ID, RESOURCE_ID), and probably not much else.

The other approach is to use a generic super-type table with specific sub-type tables. This is the implementation which Damir has modelled. The normal use caee for super-types is when we have a bunch of related entities which have some attributes, behaviours and usages in common plus seom distinct features of their own. For instance, PERSON and USER, CUSTOMER, SUPPLIER.

Regarding your scenario I don't think USER, PRODUCT and GALLERY fit this approach. Sure they are all consumers of RESOURCE, but that is pretty much all they have in common. So trying to map them to an ITEM super-type is a procrustean solution; gaining a generic ITEM_RESOURCE table is likely to be a small reward for the additiona hoops you're going to have to jump through elsewhere.

Upvotes: 1

I have a database design where i store images in a table called resource_file.

You're not storing images; you're storing filenames. The filename may or may not identify an image. You'll need to keep database and filesystem permissions in sync.

Your resource_file table structure says, "Image filenames are identifiable in the database, but are unidentifiable in the filesystem." It says that because resource_file_id is the primary key, but there are no unique constraints besides that id. I suspect your image files actually are identifiable in the filesystem, and you'd be better off with database constraints that match that reality. Maybe a unique constraint on (filename, extension).

Same idea for the resource table.

For resource_relation, you probably need a unique constraint on either (resource_id, data_id) or (resource_id, data_id, module_code). But . . .

I'll try to give this some more thought later. It's kind of hard to figure out what you're trying to do resource_relation, which is usually a red flag.

Upvotes: 0

Damir Sudarevic
Damir Sudarevic

Reputation: 22187

enter image description here

This one uses super-type/sub-type. Note how primary key propagates from a supert-type table into sub-type tables.

Upvotes: 2

Related Questions