Sthe
Sthe

Reputation: 2695

What's better between natural and composite keys in this situation

I have a table that has to store millions of posts (in the near future) in mysql. Here is the simplified structure (I didn't indicate my primary keys because my question is based on that):

CREATE TABLE `posts` (
    `post_id` INT NOT NULL AUTO_INCREMENT,
    `user_id` BIGINT(20) NOT NULL,
    `title` VARCHAR(100),
    `content` TEXT
) ENGINE = MyISAM;

My question is: What is the best way to define my keys?

  1. Is it best to use my AUTO_INCREMENTing 'post_id' only as a primary and a unique key considering the amount of records that needs to be stored?

  2. Must I use both 'post_id' and 'user_id' as a composite key to use as a primary and unique key? If this is best, how do I use it in other tables as a foreign key? Do I simply add them as columns in those tables?

Could you please also indicate the advantages and disadvantages (if any) of each and perhaps some advice on which ENGINE to use. I think Innodb will be best if I use the second option. I don't know.

Upvotes: 1

Views: 285

Answers (1)

scott.korin
scott.korin

Reputation: 2597

Whether you use an auto-incremented field as a primary key or a composite key using post_id and user_id will come down to basically the following:

If you have child tables to your posts table, will you ever want to query on those tables using the user-id of the post?

For example, if other uses are allowed to comment on a post, and you have a comments table, do you see reasons why you would want to get data from the comments table where you query on the user_id of the original post?

If so, by using a auto-incremented field, you are always going to have to join on the parent table (posts) in order to query on data on your child table based on the user_id:

SELECT comments.* 
FROM comments
INNER JOIN posts ON
    posts.post_id=comments.post_id
WHERE posts.user_id='scott.korin'

This can cause a performance hit, especially if you expect millions of rows of data in the posts table.

If you don't have the need to query on child tables using the user_id field, then I would use the auto-incremented post_id. Just make sure you define the field big enough. (if you except millions of records, you don't want to get stuck with only a maximum of a few millions records because you made the post_id field too small).

Upvotes: 1

Related Questions