MySQL vs XML for Tagging mechanism

Question

I want to store tags per Media (Image/Sound/Video/Pdf) uploaded by users. And in future I would like to use this tags for searching the media. I am a beginner developer, and I used one way to do this.

I made two tables

TAG (Tagid,Tagname)
Mediatags (Mediaid,Tagid)
- Media is data uploaded data by user
- tags are simple words used for searching the media again

For example If I upload an image it will have a mediaid=M1 and I will give it two Tags as Tag1, Tag2. Now

Table 1} will contain data => Row1 = T1-Tag1 ; Row2 = T2-Tag2 as schema[Tagid,Tag name]
Table 2} will contain data => Row1 = M1-T1 ; Row2 = M1-T2 as schema[Mediaid,Tagid]

It will be repeated for each uploads by users. When I have to search perticular tag for example: Searching 'Tag1'. It will find Tag id of Tag1 that is 'T1' It will look up to each meadiid in Mediatags table where tag id is 'T1'

Then I will get mediaids of those which has tags as T1=Tag1

I am using MySQL for this. Now I have another mechanism in mind,

IF I make an XML file for each tag containing all media ids which has that Tag. In this example, there will be 2 XML files TAG1.XML , Tag2.XML and both will contain data M1 It creates two steps while searching. Find the XML file of that tag. And we get all mediaids in that which has used that tag. So here, I will find XML file TAG1.XML and read it. It will give me all those media ids which has used tag M1.I think It can be help full while searching. But it will generate too much XML files! I will sort them in Alphabetical dictionaries.

I just want to know, 1. How efficient here is to use this XML approach then MySQL? 2. Finding data in MySql 2 tables then finding an XML file stored in a fixed directory, which is better?

If i used first approach, so many rows will be there in a MySql table where in second approach there will be so man XML files(an XML File will not be heavy but so many XML files will be there!)

That is why I am confused what is more preferred appraoch! Please suggest!

O. Jones · Accepted Answer

Both approaches will work. Your question is one of efficiency, not capability (if I understand you). To answer efficiency questions you need to know quite a lot about your appplication's requirements.

Here are some questions to ask yourself:

Will the data items (the artist bios) be fairly static, or will your system have lots of them coming, changing, and going all the time?
Do you plan for a lot of dynamic lookup of these items?
Will your lookup be done by tag, or by content, or both?

If your system has a low volume of changes and lookups, using XML has advantages. It's more flexible in the way the data you store is defined. The XPath query language is tricky to learn, but it works well once you get good at it. But unless you invest in installing and learning to use an XML-aware server like MarkLogic, it will not be fast. It will be slow.

Otherwise, stick with MySQL. It has some key advantages for this kind of work including its FULLTEXT query capability.

What you're doing is a subset of what's called "semantic web" work. You might do some reading on that, and on the scheme called "Dublin Core" for representing metadata. A lot of brainpower has gone into trying to figure this out. Slightly less brainpower has gone into trying to explain their results, which makes learning about the Semantic Web a hard slog, but it is still worth it.

Please consider using a content-management system like WordPress, Drupal, or Joomla! to build your first version of this system. You might also consider using a wiki platform like MediaWiki or Twiki.

You'll find that those systems support a variety of tagging (taxonomy) schemes pretty well. And as for search: it just works. "Stick-built" web applications, that start with echo "Hello, world of music!"; take a lot of repetitive and boring work before you get any results at all.

If you were building a parade float, you'd start with a truck, not a bucket of bolts, a saw, and some lumber. Same deal for a web site.

MySQL vs XML for Tagging mechanism

Answers (1)

Related Questions