bendataclear
bendataclear

Reputation: 3850

GUID vs Identity for use with tags

I know this subject has been discussed at length (I have read lots of posts here and all over the web) and I don't like to flog a dead horse but I have a question about a more specific aspect of Integer ID vs GUID:

I am writing out a schema which is going to have a tagging ability similar in use to the stackoverflow tags but it will have the same tags used against 5+ different tables.

The basic tables I will be linking are as follows:

Tag Table

Tag ID    Tag Name      Tag Description
-------------------------------------------------------------
     1    Hats          Tag for hats
     2    Coats         Tag for coats
     3    Gloves        Gloves tag
     4    Ladies        Ladies item

Items Table 1

Item ID    Item Name    Cost
------------------------------------------------------------
      1    Deerstalker  £20.00
      2    Fedora       £50.00
      3    Scarf        £15.00

The bit I'm having trouble with is the tag_item table.

I will have 5 tables with completely different structures that I want the users to be able to apply tags to so I think I'll need to do one of the following:

What I'm unsure of is how this will affect performance for:

Is there any clearly better option in this case or anywhere I could read up on the advantages in this particular scenario?


Edit:

For anyone interested, below is the schema I have decided upon:

Table for Tags

CREATE TABLE [dbo].[Sys_TagList](

    [Sys_Tag_Primary] [int] IDENTITY(1,1) NOT NULL,
    [Sys_Tag_Name] [varchar](50) NOT NULL,
    [Sys_Tag_Description] [varchar](1000) NULL

)

Table for Tag_Items

CREATE TABLE [dbo].[Meta_Tags](

    [Met_Tag_Primary] [int] IDENTITY(1,1) NOT NULL,
    [Met_Tag_Link_FK] [int] NOT NULL, -- Link to [Sys_Tag_Primary] field
    [Met_Tag_Name] [varchar](25) NOT NULL, -- To enable tag searches to use a single table
    [Met_Tag_Table] [varchar](25) NOT NULL, -- Table name, used with [Met_Tag_RowID] to create link to that item
    [Met_Tag_RowID] [int] NOT NULL -- Primary key of item[n] table

)

Upvotes: 8

Views: 425

Answers (2)

JustinC
JustinC

Reputation: 415

Obligatory possible tagging design duplicate

Depending on how you intend to use the tags and what performance characteristics are important, one possible alternative to the map/link table is a hashed combination table.

Calculate the hash of a given requested combination of tags and store that directly in your existing tables, and in the combination table.

Querying for anything that is a subset of tags might start to get kind of hacky though as you'll end up doing a lot of in memory transforms of all those hashes. Perhaps there is a cleaner way of querying hashed reference values out there, as I cant recall doing this, and didn't immediately locate anything in a search.

Upvotes: 1

Ed Harper
Ed Harper

Reputation: 21505

I'm going to (perhaps unhelpfully) suggest that GUID vs ID is the wrong question to be focusing on. The assumption that you need a single table linking tags to items may be worth examining first.

If your five item categories are different enough to require different tables then you should probably consider five tag_item link tables - one for each item table. Although the code to query them all at once (using UNION ALL to concatenate the results) may be a bit more verbose and repetetive than the code for a single link table, the queries are likely to be substantially simpler - good from both a maintenance and a query-plan perspective.

The other alternative would be to refactor the common attributes that the five item categories share into a single central table (to which the tags are linked) and store the additional non-common attributes in five child tables. It's difficult to know whether this is appropriate without some more information about how the existing data is currently used.

Upvotes: 9

Related Questions