I'm in the middle of designing a new database that will need to support replication and I'm stuck on the decision what to choose as my primary key. In our current database for primary key we use two int columns, first column is identity and the other is used to describe on which server the row is inserted. Now I want to avoid using two columns for primary key, and just use one column instead. So far I have two way of doing this: Use GUID for my primary key This one will ensure that there is always a unique key across any number of servers. What I don't like with this one, is that the GUID is 16bytes in size, and when used for foreign key across many tables it will waste space. Also it is harder to use it when writing queries, and it will slower to query. Use the int or bigint, and manually specify the seed and increment value for every table on each server. For example if there is two servers, the X table on the first server will start from number 1, and on second server it will start from number 2, each will increment by 2. So there would be (1,3,5,...) on first, and (2,4,6,...) on second server. The good thing with this design is that it is easier to use when writing queries, it's fast, and it use less space for foreign keys. Bad thing is that we never know how many servers will be running so it's harder to tell what will be the increment value. Also it's harder to manage the schema change on server. What is the best practice for managing multiple servers, and what's the best way, if any, to do in this kind if situations?

Reputation: 1194

Managing identity/primary key across servers

I'm in the middle of designing a new database that will need to support replication and I'm stuck on the decision what to choose as my primary key.

In our current database for primary key we use two int columns, first column is identity and the other is used to describe on which server the row is inserted. Now I want to avoid using two columns for primary key, and just use one column instead. So far I have two way of doing this:

Use GUID for my primary key

This one will ensure that there is always a unique key across any number of servers. What I don't like with this one, is that the GUID is 16bytes in size, and when used for foreign key across many tables it will waste space. Also it is harder to use it when writing queries, and it will slower to query.
Use the int or bigint, and manually specify the seed and increment value for every table on each server. For example if there is two servers, the X table on the first server will start from number 1, and on second server it will start from number 2, each will increment by 2. So there would be (1,3,5,...) on first, and (2,4,6,...) on second server. The good thing with this design is that it is easier to use when writing queries, it's fast, and it use less space for foreign keys. Bad thing is that we never know how many servers will be running so it's harder to tell what will be the increment value. Also it's harder to manage the schema change on server.

What is the best practice for managing multiple servers, and what's the best way, if any, to do in this kind if situations?

Upvotes: 8

Answers (3)

Charl

Reputation: 1002

Your question is a good one, and one that is asked often.

From a maintenance perspective, I would absolutely go with GUIDS. They are there for a reason. Somewhere along the line you might run into complex operations moving and re-replicating your data, and then the other options can make it a little more complex than it needs to be.

There is a very nice short read about the various options here:

http://msdn.microsoft.com/en-us/library/bb726011.aspx

As for the Replication part - if done properly, there are no real headaches with replication.

Upvotes: 2

Robert Cutajar

Reputation: 3701

I dare to advise against replication altogether :) it's certainly more pain than fun. If you can afford, look into Sync framework.

Playing with identity is not flexible to say the least. Consider adding moving servers. Identity insert, different schemas and so on.

GUID would be alright for a clustered key if you used newsequentialid() as a default value. It is a bit larger (a number of bits), but it solves the hassle once and for good :)

The way I'd go is to have an int identity clustered key which is only relevant to the database context. Then add a GUID column, which makes sense to the synchronization context. Top it up with a rowversion column to see what's ready for sync.

Upvotes: 0

Bryan

Reputation: 17703

Update:

Found a more simple/manual method here. Involves using NOT FOR REPLICATION and staggering identity seed values as you mentioned in comments.

Original:

Your best bet is something like the second option listed. Assign identity ranges for the replication publisher and subscriber instances, then turn on automatic range management.

This article discusses options for managing identity columns in replication, and enabling identity range management is discussed here.

Since you don't know how many servers will be in the replication pool, you may need to reconfigure article properties periodically.

Upvotes: 0

Managing identity/primary key across servers

Answers (3)

Related Questions