Andrew Johns
Andrew Johns

Reputation: 705

database design for tagging multiple sources (MySQL)

I'm working on a project where I have the following (edited) table structures: (MySQL)

Blog
    id
    title
    description

Episode
    id
    title
    description

Tag
    id 
    text 

The idea is that that tags can be applied to any Blog or Episode (and to other types of sources), new tags can be created by the user if it doesn't exist already in the tag table.

The purpose of the tags is that a user will be able to search the site, and the results will search across all types of material on the site. Also, at the bottom of each blog article/episode description it would have a list of tags for that item.

I'd thought too much about the search mechanism, but I guess it'd be flexible between an OR and AND searches, if that has any impact on choices, and probably allow the user to filter the results for particular types of sources.

Originally I was planning to create multiple tag mapping tables:

BlogTag
    id
    tag_id
    blog_id

EpisodeTag
    id
    episode_id
    tag_id

But now I wonder if I would be better off with:

TaggedStuff
    id
    source_type
    source_id
    tag_id

Where source_type would be an integer related to whether it was an Episode, Blog, or some other type that I've not included in the structures above, and source_id would be the reference in that particular table.

I'm just wondering what the optimum structure would be for this, the first choice or the second?

Upvotes: 1

Views: 1384

Answers (3)

Fabian Barney
Fabian Barney

Reputation: 14549

In a clean (academic) design you would often see to have a supertype Resource (or something similar) for Blog and Episode with it's own table. Another table for the tags. And since it's a N:M relationship between Tag and Resource you have an extra mapping table between them.

So in such a design you would associate the Tag-Entities with your resources by having a relationship to their generalization.

simplified ER-Diagram

After that you can put general attributes to the generalization. (i.e. title, description) You can add attributes to the relationship between Tag and Resource like a counter how often a specific resource was tagged with a specific tag. Or how often a tag was used and and and (i.e. something like you see on stackoverflow in the upper right here)

Upvotes: 3

Dmitri Gudkov
Dmitri Gudkov

Reputation: 2123

If I understand you correctly, the point is to optimize search mechanism... So it has sense to make some kind of index_table and demoralize the data there...

I mean smth like this: Url, Type, Title, Search_Field etc.. where Url is the path to the article or episode, Type (article|episode), Name (what users will see), Search_Field ( list of tags, other important data for search )

thats why both variants are quite good)))

Upvotes: 0

Ozair Kafray
Ozair Kafray

Reputation: 13539

The biggest loss in going with structure 2 is loss of referential integrity. If you can say "whatever" to that, it might be easier to go with this structure.

When I say structure 2 I mean:

TaggedStuff

id
source_type
source_id
tag_id

Upvotes: 1

Related Questions