Julio Garcia
Julio Garcia

Reputation: 1954

What is the correct strategy to normalize a database with articles and tags for the articles?

I am building a system that stores articles and tags that categorize the article. Standard stuff, similar to how this website does it. Now my question is whether I should store the tags on a separate table that just contains tags and article ids or store the tags on an extra column in the articles table. My first instinct would be to normalize the database and have two tables. The problem is that the interface with which the user administers the tags is a simple text box with all tags separated by commas. So when the user commits his changes, in order to find out which tags where added, changed or subtracted, I would need to first query the database , compare the results with the new data on a tag basis and then process the changes accordingly. A process with a huge overhead, compared with simply updating the one filed in the one row of the articles table. How would you do it or is there a third option I haven’t considered?

PD. I am stuck with a relational database for this project .

Upvotes: 0

Views: 173

Answers (2)

Yellowfog
Yellowfog

Reputation: 2343

You need to normalize the database in order to run queries such as 'find all articles with tag T'.

I don't think that there will really be that much overhead in grabbing all of the tags to compare them with the new tags, assuming that you've applied correct indexes.

Personally I wouldn't delete all the tags then insert all the new ones, because I might want to do things like audit when individual tags are entered.

If you're using SQL Server 2008 then I suggest that you look at the MERGE command.

Upvotes: 0

Macros
Macros

Reputation: 7119

If you are using a separate table, rather than trying to figure out which tags have changed each time, simply delete all for the given article ID, and then insert all of the supplied tags - this should present very little overhead.

In a tagged system the performance that would normally be most important is the retrieval of tags and / or the retrieval of the related content. Using a separate table with an indexed tag column should provide very fast lookup in a situation where an item can have any number of tags.

Upvotes: 1

Related Questions