Cyril
Cyril

Reputation: 207

Is ID column always required in SQL?

to be more specific, I've created a news module with a tag system. As each tag is unique (you are not allowed, as an admin, to create 2 identical tags), is the id column still useful? I guess no, but I was wondering about the performances.

id | mews_title | date ... <-------------> news_id | tag_id <----------> id | tag_name

VS

id | mews_title | date ... <-------------> news_id | tag_tag_name <----------> tag_name

Thanks a lot!

Upvotes: 3

Views: 1746

Answers (3)

Bill Karwin
Bill Karwin

Reputation: 562891

The performance difference is insignificant.

Advantages to using a numeric id for the tags in your example would be:

  • to make the intersection table somewhat smaller because integers are smaller on average than a string
  • to allow changing the spelling of a tag name by updating one row instead of many rows

These may not be important considerations for your case. So no, it's not required to use a numeric id.

I also wrote about this in a chapter titled "ID Required" in my book, SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.

Upvotes: 6

nachospiu
nachospiu

Reputation: 2049

Numeric IDs are not required (You could have a table without a primary key), but they are useful:

  • Without and ID column, if you want to change a tag_name in the future (because for example it is misspelled), you will have to update all the tables that have this foreign key (tag_tag_name). If you have and ID column you only will have to change the tag_name in one place.
  • They could be auto generated (Autoincremental).
  • It could take up less space. For example when the natural primary key is composite and is the foreign key in another table, and id (one integer column) primary key take up less space.
  • It's easier make joins with other tables (If the primary key has only a column than if it has more than one).
  • if all yours table's primary key is the ID field, you don't have to remember others field names when you make joins, your data schema syntax will be consistent.

You can specify the tag_name as unique, so two tag_name equals couldn't be inserted.

Upvotes: 2

Rick James
Rick James

Reputation: 142528

You don't even need a table of tags. Just a table with

news_id | tag_name

This is a many-to-many mapping between news_ids and tags. Since there is no tag table, the issue of duplicates vanishes. The same tag (tag_name) can be associated with many news articles, but that is what you want.

INSERT INTO Tags (news_id, tag_name) -- change this to INSERT IGNORE INTO Tags (news_id, tag_name) so you don't have to worry about getting the same news article tagged the same way twice. And have

PRIMARY KEY(news_id, tag_name), -- uniqueness and lookup tags for one article
INDEX(tag_name, news_id)  -- For finding all articles for one tag

Upvotes: 2

Related Questions