Toni Kostelac
Toni Kostelac

Reputation: 361

Advice needed on primary key selection

I'm building a database for what is soon to be my version of a social networking site. Now, I'd like to store friend relations, sort of like facebook does. I should mention that I'm using MySQL for this.

So i'm thinking of doing something like this:

UserFriends
(
    UserFriendID SOME_DATA_TYPE NOT NULL AUTO_INCREMENT PRIMARY KEY,
    UserID BIGINT(20) UNSIGNED NOT NULL,
    FriendID BIGINT(20) UNSIGNED NOT NULL -- This is basically the same as UserID
)Engine=InnoDB;

Now, I'm looking for some type of data type to use for the primary key for this table as I expect that there will be a ton of records and I'd like some type of indexing to speed up any types of look-up that I might do on the records. Such as a friend suggestion feature etc.

I'm open to suggestions. Another option, in my opinion, but much more difficult to manage is to dynamically create a separate table for each user and store their friends in them, however this would be sort of a nightmare to manage code-wise.

Upvotes: 0

Views: 183

Answers (4)

iouri
iouri

Reputation: 2929

If you are expecting to have enough records to fill INT data type, MySQL is not the right solution, especially for recommendations, multi level friend-of-friend-of-friend etc. It might be more suited for one of Graph databases out there. Neo4j is a good example, designed specifically for social networks. http://neo4j.org check it out, might be a good alternative. You don't have to get rid of mysql, it most likely will be a hybrid approach.

Upvotes: 0

If you do something like this

create table UserFriends
(
    UserFriendID SOME_DATA_TYPE NOT NULL AUTO_INCREMENT PRIMARY KEY,
    UserID BIGINT(20) UNSIGNED NOT NULL,
    FriendID BIGINT(20) UNSIGNED NOT NULL -- This is basically the same as UserID
) Engine=InnoDB;

then you'll probably end up with data that looks like this.

UserFriendID  UserID  FriendID
--
1             100     201
2             100     201
3             201     100

The problem with that should be obvious.

If you don't need to know who friended whom, then something like this would make more sense. (Standard SQL, not MySQL.)

create table UserFriends (
    UserID BIGINT(20) UNSIGNED NOT NULL,
    FriendID BIGINT(20) UNSIGNED NOT NULL,
    primary key (UserID, FriendID),
    check (UserID < FriendID),
    foreign key (UserID) references users (UserID),
    foreign key FriendID references users (UserID)
);

The primary key constraint guarantees that you don't have multiple identical rows for a single "friendship". The check() constraint guarantees that you don't have two rows, differing only in the order of the id numbers, for a single "friendship".

But because MySQL doesn't enforce check() constraints, you'll have to write a trigger to make sure that UserID is less than FriendID.

Upvotes: 1

AndreKR
AndreKR

Reputation: 33707

Just use INT. There are lots of methods to optimize performance, choosing an unusual primary key data type is not one of them.

Don't create one table per user. If you really have a lot of users, you can split them by some shard key later when you know where your bottlenecks are.

Upvotes: 0

Randy
Randy

Reputation: 16673

use the same pattern BIGINT(20)

avoid a table per user like the plague :)

Upvotes: 0

Related Questions