Reputation: 1625
Say I am working on a free for all blog where writers are allowed to tag their posts (basically twitter et all). How do I do that so that queries are optimized for speed?
I am thinking of having a tag field where the content is a list of tags
for example an entry may be an article about the president and the tags may be:
tag={politics, family, social, news, economy}
Notice that the tags are not predefined and that a writer adds them at whim/will.
Is my approach as described the best? Or is there a more efficient approach to doing this?
As it may matter, I am using the app-engine datastore and my language is java.
The ultimate objective is to allow users to search by specific tags, say a user wants to search by the tag "marriage".
Upvotes: 0
Views: 74
Reputation: 314
First off, you probably do not want any field in a database to be a list of anything, as you typically want it to stay in third normal form at worst and having a column be a list of values violates first normal form.
The way that I've done tagging with small-scale projects (haven't tried anything large so I don't know for sure how scalable or not this would be) is to have a lookup table containing the tags, and a linking table containing the key for the thing to be tagged and the key for its tags.
So it would look something like this:
Thing Thing_Tags Tags
------ ----------- ----
ThingID (Guid) ThingID (Guid) TagID (Guid)
otherAttributes TagID (Guid) Tag (varchar)
Upvotes: 2