sqldatabasepostgresqldatabase-schemapostgresql-9.3

Reputation: 35

What does PRIMARY KEY actually signify, and does my table need one?

I have a PostgreSQL 9.3 database with a users table that stores usernames in their case-preserved format. All queries will be case insensitive, so I should have an index that supports that. Additionally, usernames must be unique, regardless of case.

This is what I have come up with:

forum=> \d users
                      Table "public.users"
   Column   |           Type           |       Modifiers
------------+--------------------------+------------------------
 name       | character varying(24)    | not null
Indexes:
    "users_lower_idx" UNIQUE, btree (lower(name::text))

Expressed in standard SQL syntax:

CREATE TABLE users (
    name varchar(24) NOT NULL
);
CREATE UNIQUE INDEX "users_lower_idx" ON users (lower(name));

With this schema, I've satisfied all my constraints, albeit without a primary key. The SQL standard doesn't support functional primary keys, so I cannot promote the index:

forum=> ALTER TABLE users ADD PRIMARY KEY USING INDEX users_lower_idx;
ERROR:  index "users_lower_idx" contains expressions
LINE 1: ALTER TABLE users ADD PRIMARY KEY USING INDEX users_lower_id...
                              ^
DETAIL:  Cannot create a primary key or unique constraint using such an index.

But, I already have the UNIQUE constraint, and the column is already marked "NOT NULL." If I had to have a primary key, I could construct the table like this:

CREATE TABLE users (
    name varchar(24) PRIMARY KEY
);
CREATE UNIQUE INDEX "users_lower_idx" ON users (lower(name));

But then I'll have two indexes, and that seems wasteful and unnecessary to me. So, does PRIMARY KEY mean anything special to postgres beyond "UNIQUE NOT NULL," and am I missing anything by not having one?

Upvotes: 3

Answers (3)

FuzzyChef

Reputation: 4061

Short answer: No, you don't need a declarative "PRIMARY KEY", since the UNIQUE index serves the same exact purpose.

Long answer:

The idea of having Primary Keys comes from database systems where the data is physically in key order. This requires having a single, "primary" key. MySQL InnoDB is this way, as are many older databases.

However, PostgreSQL does not keep the tables in key order; it separates the indexes, including the primary key index, from the heap, which is essentially unordered. As a result, in Postgres, there is no material difference between primary keys and unique indexes. You can even create a foreign key against a unique index, as long as that index covers the whole table.

That being said, some tools external to PostgreSQL look for primary keys and do not regard unique indexes as being equivalent. These tools may cause you issues because of not finding a PK.

Upvotes: 0

Erwin Brandstetter

Reputation: 656982

First off, practically every table should have a primary key.

`citext`

The additional module provides a data type of the same name. "ci" for case insensitive. Per documentation:

The citext module provides a case-insensitive character string type, citext. Essentially, it internally calls lower when comparing values. Otherwise, it behaves almost exactly like text.

It is intended for exactly the purpose you describe:

The citext data type allows you to eliminate calls to lower in SQL queries, and allows a primary key to be case-insensitive.

Bold emphasis mine.
Be sure to read the manual about limitations first. Install it once per database with

CREATE EXTENSION citext;

`text`

If you don't want to go that route, I suggest you add a serial as surrogate primary key.

CREATE TABLE users (
    user_id serial PRIMARY KEY
  , username text  NOT NULL
);

I would use text instead of varchar(24). Use a CHECK constraint if you need to enforce a maximum length (that may change at a later time). Details:

Along with the UNIQUE index in your original design (without type cast):

CREATE UNIQUE INDEX users_username_lower_idx ON users (lower(username));

The underlying integer of a serial is small and fast and does not have to waste time with lower() or the collation of your database. That's particularly useful for foreign key references. I mostly prefer that over some natural primary key with varying properties.

Both solutions have pros and cons.

Upvotes: 5

ZeRaTuL_jF

Reputation: 592

I would suggest using a primary key, as you have stated you want something that is unique, and as you have demonstrated that you can put unique constraints on a username. I will assume that since this is a unique,not null username that you will use this to track your users in other parts of the Database, as well as allow usernames to be changed. This is where a primary key will come in handy, instead of having to go into all of your tables and change the value of the Username column, you will only have one place to change it. Example

   Without primary key:
   Table users
   Username
   'Test'


   Table thingsdonebyUsers
   RandomColumn AnotherColumn  Username
   RandomValue   RandomValue    Test

Now assume your user wants to change his username to Test1, well now you have to go find everywhere you used Username and change that to the new value before you change it in your users table since I'm assuming you will have a constraint there.

  With Primary Key
  Table users
  PK           Username
  1              'Test'


  Table thingsdonebyUsers
  RandomColumn   AnotherColumn     PK_Users
  RandomValue    RandomValue         1

Now you can just change your users table and be done with the change. You can still enforce unique and not null on your username column as you demonstrated. This is just one of the many advantages of having normalized tables, which requires your tables to have a Primary Key that is an unrelated value(forget what the proper name is for this right now).

As for what a PK actually signifies, it just a non nullable unique column that identifies the row, so in this sense you already have a Primary Key on your table. The thing is that usually PKs are INT numbers because of the reason that I explained above.

Upvotes: 3

What does PRIMARY KEY actually signify, and does my table need one?

Answers (3)

citext

text

Related Questions

`citext`

`text`