Reputation: 75
I have people uploading art on my site. They enter titles, tags, and the file.
Once it's uploaded i have tags the title, tags separated by commas and the filepath for converted image file.
I now want to find closest related art by tags to this one. So to find closest match I have to explode the tags and search for each individual one? It seems like a lot of work on the server. I was wondering can anyone tell me what's the correct way to store the tags and data and what search would I have to do?
Would I have to have one table that holds the title, and filepath and another table to hold id of the art and a column of one of the tags. Meaning if I have "pencil, animal, wildlife" I'd have three rows in the tags table with the same art ID ?
Upvotes: 1
Views: 826
Reputation: 218837
Just separate out your data entities by what they are and what they mean. For title
, tags
and file
it sounds like you have two entities:
Picture
----------
ID
Title
File
Tag
----------
ID
Name
That is, the title
and the file
(in your case I guess you're storing that as the path to the file on the file system, which is fine) are one entity, and a tag
is its own separate entity. Since each Picture
can have multiple tag
s and each tag
can relate to multiple Picture
s, it's a many-to-many relationship. So one would generally create a supporting non-entity table to link them in the database:
PictureTagRelationship
----------
PictureID
TagID
With this, you can get a Picture
:
SELECT Picture.Title, Picture.File FROM Picture WHERE Picture.ID = ?id
and its tags:
SELECT Tag.ID, Tag.Name FROM Tag
INNER JOIN PictureTagRelationship ON Tag.ID = PictureTagRelationship.TagID
WHERE PictureTagRelationship.PictureID = ?id
(You can do that in a single query in a couple of ways as well, I just split it into two for simplicity. Two queries shouldn't be a big deal, but if you need to highly optimize your database access overhead or if you really want it to be a single query then I'm sure something can be done.)
Or you can get all the pictures for a specific tag:
SELECT Picture.ID, Picture.Title, Picture.File FROM Picture
INNER JOIN PictureTagRelationship ON Picture.ID = PictureTagRelationship.PictureID
WHERE PictureTagRelationship.TagID = ?id
There are other tweaks that can be made to this design and plenty of other ways to view and report on the data. But in all of this is one key point:
Don't use comma-delimited lists to store data. Normalize each data entity into its own structure and store it accordingly. Relational databases are great for that sort of thing. But any time you store separate data elements as a delimited string, you lose that separation of those elements. This makes it more difficult to report on that data, more difficult to interact with it, a lot more difficult to update it, and less intuitive to anybody else who needs to support it.
Just remember that any one field in the database should be storing one piece of information and only one piece of information. If you have to cram multiple pieces of information into a single field then you're not using the relational database properly.
Upvotes: 1