user1263981
user1263981

Reputation: 3147

Is this okay to have a Alphanumeric field as a PrimaryKey?

I am rewriting a new timesheet application including redesigning database and it will require data migration from Oracle to Oracle.

In the old system field ‘EmployeeCod’ is a Primary Key and it is in Alphanumeric form i.e. ‘UK001’, ‘UK002’,‘FR001’,’FR002’, ‘US001’ . Employee table is also linked to timesheet and other tables where the EmpCode is being referred as a FK.

To make the JOINs perform faster in the new system I was thinking about adding a new INT column in the Employee table and set it to PK. (Don't know if it will make any big difference)

-Employee table has about 600 rows.

-Data type of EmpCode is Varchar2(20) in old DB which I can reduce to Varchar2(6) in the new system and alter it later as company expends.

I am wondering if it is better to keep the EmpCode as a Primary Key which will make things easier in migrating data or should I add a INT column?

Someone has given me following advise in one of my previous thread:

“if you need to create a composite code of AANNN then I'd split this into two: a simple 'Prefix' field of CHAR(2) and an identity field of INT, then turn EmpCode into a computed field that concats the two and stick an index on there that (@Chris)”

I am not sure if this option would work as employee table is linked to other tables as well. (EmpCode is being used as FK in other tables) n

Upvotes: 0

Views: 3874

Answers (5)

Walter Mitty
Walter Mitty

Reputation: 18940

If you do add this PK, and also keep the former PK, you will have some data management issues to deal with. Or perhaps your customers. Getting rid of the old PK may not be feasable if there are existing users who will be upgrading to the new database.

If EmployeeCode, the former PK is used by the users of the data to identify Employees, then you will have to add a constraint to make sure that this field is unique. Carrying both codes will wipe out any performance gains you were hoping for.

If it were me, I'd leave well enough alone. The performance gains, if any, will be trivial.

Upvotes: 1

Neville Kuyt
Neville Kuyt

Reputation: 29629

There will be no performance improvement - in fact, unless you know and can prove/measure that you have a performance problem, changing things "to make them faster" usually leads to pain.

However, there is a concern that your primary key appears to carry meaning - it's a country code, concatenated with a number. What if an employee moves from the US to the UK? What if the UK hires its 1000th employee?

For that reason, I'd refactor the application to use a meaningless primary key; whether it's an INT or a VARCHAR is not hugely relevant.

Upvotes: 1

Doobi
Doobi

Reputation: 4842

I use alphanumeric primary keys regularly and see absolutely no issues with it. There is no performance issue, you have a wider addressable space, and you can be more expressive/human readable. Integer keys are just a convention.

Add to that the risk you're adding to you project by adding a major architectural change over and above the porting issues, I'd say stick with the existing schema as much as possible.

Upvotes: 1

Mike Perrenoud
Mike Perrenoud

Reputation: 67898

The performance difference will be negligible if the index you're creating on the alphanumeric field is the clustered index for the table. Which, based off of your question is going to be the case, but I wanted to note that for completeness. I say this for two reasons:

  1. A clustered index is the physical order of the table and so when seeking against that index, looking for more data presumably off of the data page in a query, a binary search can be performed against it because it's also physically stored in that order.
  2. A binary search is just about as efficient as you can get, lest we forget though a statistical index. I call this out because integer primary keys build statistical indexes which are as fast a seek as you can get because mathmatically speaking we know 2 comes after 1 for example.

So, just keep that in mind when building alphanumeric, or even compound, keys and indexes and trying to compare the difference between them and an integer key. Personally, I prefer to stick with integer primary keys because I have found them to perform better over time during extreme growth.

I hope this helps.

Upvotes: 1

Matt
Matt

Reputation: 6953

You do occassionally come across alphanumeric primary keys.. personally I find it just makes life more difficult.. if you are able to change it and you want to change it, I would say go ahead.. it will make things easier for you later. As for it being an FK, you would need to be careful to write a script to properly update all the data. One way you can do this is:

Step 1: Create a new int column for the PK and set Identity Insert to true
Step 2: Add a new int column in your child table and then:
Step 3: write an update script like this:

    UPDATE childTable C
    INNER JOIN parentTable P ON C.oldEmpID = P.oldEmpID
    SET C.myNewEmpIDColumn = P.myNewEmpIDColumn

Step 4: Repeat steps 2 & 3 for all child tables
Step 5: Delete all old FK columns

Something like that and don't forget to backup your current DB first ;)

Upvotes: 0

Related Questions