Ross
Ross

Reputation: 14415

Storing Tags in Database. Store tag once or many times?

I would like a bit more clarification on the toxi method of storing tags in a database – mentioned elsewhere on SO.

The database schema is:

Table: Item
Columns: ItemID, Title, Content

Table: Tag
Columns: TagID, Title

Table: ItemTag
Columns: ItemID, TagID

This is probably a stupid question (but I don't know the answer)... Should each entry, in the Table Tag, have a unique title. i.e. I only store a tag once or store it every time I use it?

To illustrate, which of the two tables below should I end up with:

TagID     Title
1         Fish
2         Cat
3         Dog

or

TagID     Title
1         Fish
2         Fish
3         Cat
4         Fish
5         Dog
6         Cat

If using the first table, before entering the tag I would first have to run and sql statement to find out if it exists, correct?

Any help will be appreciated. I had my fingers burn recently due to hacking together and indexing, want to start getting the basics correct.

Upvotes: 8

Views: 3379

Answers (3)

Eimantas
Eimantas

Reputation: 49344

The basics are that you need to store tags like you're showing in first case. It's good for checking if tag exists (since in second case for existing tags your db would return as many rows as there are those appearances of tag) and good for retrieving items by tag (select item id by one tag id is better than selecting item ids by a set of tag_id which has same representational meaning).

If you had burnt your fingers because of indexing - you should always check how query is being executed (for mysql it's EXPLAIN/DESCRIBE SELECT).

Upvotes: 4

Martin v. Löwis
Martin v. Löwis

Reputation: 127447

You should have the tags only once in the tag table; the whole point of the ItemTag table is to provide you with an n:m association (each item having multiple tags, and each tag belonging to multiple items).

If you would repeat the tag titles, you could simplify the structure by having the Tag table use ItemIDs right away, not tag IDs.

Upvotes: 1

Pascal MARTIN
Pascal MARTIN

Reputation: 400932

If "Fish" and "Fish" are the same tag, you should probably have it only once in your Tag table.

So, I would go with your first solution -- which, indeed, implies doing a select before your insert, to determine whether the tag already exists or not ; and, if it exists, using its already existing TagID for the link between the item and the tag, in the ItemTag table.

Actually, this is the reason for which the ItemTag exists : it is an association table, which stores correspondances between items and tag : for each item, you can have several tags, and, for each tag, you can have several items.

This will also, btw, make things easier to get a list of items that are attached to a specific tag.

Upvotes: 2

Related Questions