user9393635
user9393635

Reputation: 1429

how common is it to use both a guid and an int as unique identifiers for a table?

I think a Guid is generally the preferred unique table row identifier from a dba perspective. But I'm working on a project where the developers and managers appear to want a way to reference things by an int value. I can understand their perspective b/c they want a simple and easy way to reference different entities.

I was thinking about using a pattern for my tables where each table would have an int Id column representing the PK column but then it would also include a Guid column as a globally unique identifier. How common is it to use this type of pattern?

Upvotes: 2

Views: 2485

Answers (2)

Dan Guzman
Dan Guzman

Reputation: 46203

I would not say GUID is generally preferred from a DBA perspective. It is larger (16 bytes rather than 4 for int or 8 for bigint) and the random variety introduces fragmentation and causes much more IO with large tables due to lower page life expectancy. This is especially a problem with spinning media and limited RAM.

When a GUID is actually needed, some of these issues can be avoided using a sequential version for the GUID value rather than introducing another surrogate key. The value can be assigned in by SQL Server with a NEWSEQUENTIALID() default constraint on a column or generated in application code with the bytes ordered properly for SQL Server. Below is a Windows C# example of the latter technique.

using System;
using System.Runtime.InteropServices;

public class Example
{
    [DllImport("rpcrt4.dll", CharSet = CharSet.Auto)]
    public static extern int UuidCreateSequential(ref Guid guid);

    /// sequential guid for SQL Server
    public static Guid NewSequentialGuid()
    {
        const int S_OK = 0;
        const int RPC_S_UUID_LOCAL_ONLY = 1824;

        Guid oldGuid = Guid.Empty;

        int result = UuidCreateSequential(ref oldGuid);
        if (result != S_OK && result != RPC_S_UUID_LOCAL_ONLY)
        {
            throw new ExternalException("UuidCreateSequential call failed", result);
        }

        byte[] oldGuidBytes = oldGuid.ToByteArray();
        byte[] newGuidBytes = new byte[16];
        oldGuidBytes.CopyTo(newGuidBytes, 0);

        // swap low timestamp bytes (0-3)
        newGuidBytes[0] = oldGuidBytes[3];
        newGuidBytes[1] = oldGuidBytes[2];
        newGuidBytes[2] = oldGuidBytes[1];
        newGuidBytes[3] = oldGuidBytes[0];

        // swap middle timestamp bytes (4-5)
        newGuidBytes[4] = oldGuidBytes[5];
        newGuidBytes[5] = oldGuidBytes[4];

        // swap high timestamp bytes (6-7)
        newGuidBytes[6] = oldGuidBytes[7];
        newGuidBytes[7] = oldGuidBytes[6];

        //remaining 8 bytes are unchanged (8-15) 

        return new Guid(newGuidBytes);

    }

}

Upvotes: 3

sniperd
sniperd

Reputation: 5274

In the vast majority of cases you'll want to either use an INT or BIGINT for you primekey/foreign key. For the most part you are looking to make sure that table can be joined to and have a way to easily select a single unique row. In theory using GUIDs all over the place gets you there too, if you were a robot and could quickly ask a colleague, "Hey can you check out ROW_ID FD229C39-2074-4B04-8A50-456402705C02" vs "Hey can you check out ROW_ID 523". But we are human. I don't think there is a really good reason to include another column that is simply a GUID in addition to your PK (which should be an INT or BIGINT)

It can also be nice to have your PK in an order, that seems to come in handy. GUIDs won't be in a order. However, a case for using a GUID would be if you have to expose this value to a customer. You may not want them to know they are customer #6. But being customer #B8D44820-DF75-44C9-8527-F6AC7D1D259B isn't too great if they have to call in and identify themselves, but might be fine for writing code against (say a webservice or some kind of API). SQL is a lot of art with the science!

In addition do you really need a global unique id for a row? Probably not. If you are designing a system that could use up more than what INT can handle (say total number of tweets in all time) then use BIGINT. If you can use up all the BIGINTs, wow. I'd be interested in hearing how and would like to subscribe to your newsletter.

A question I ask myself when writing stuff, "If I'm wrong how hard will it be to do the other way?". If you really need a GUID later, add it. If you put it in now and just 1 person uses it you can never take it out and it will have to be maintained... job security? nah, don't think that way :) Don't over engineer it.

Upvotes: 2

Related Questions