Reputation: 35
I have a PostgreSQL 9.3 database with a users table that stores usernames in their case-preserved format. All queries will be case insensitive, so I should have an index that supports that. Additionally, usernames must be unique, regardless of case.
This is what I have come up with:
forum=> \d users
Table "public.users"
Column | Type | Modifiers
------------+--------------------------+------------------------
name | character varying(24) | not null
Indexes:
"users_lower_idx" UNIQUE, btree (lower(name::text))
Expressed in standard SQL syntax:
CREATE TABLE users (
name varchar(24) NOT NULL
);
CREATE UNIQUE INDEX "users_lower_idx" ON users (lower(name));
With this schema, I've satisfied all my constraints, albeit without a primary key. The SQL standard doesn't support functional primary keys, so I cannot promote the index:
forum=> ALTER TABLE users ADD PRIMARY KEY USING INDEX users_lower_idx;
ERROR: index "users_lower_idx" contains expressions
LINE 1: ALTER TABLE users ADD PRIMARY KEY USING INDEX users_lower_id...
^
DETAIL: Cannot create a primary key or unique constraint using such an index.
But, I already have the UNIQUE constraint, and the column is already marked "NOT NULL." If I had to have a primary key, I could construct the table like this:
CREATE TABLE users (
name varchar(24) PRIMARY KEY
);
CREATE UNIQUE INDEX "users_lower_idx" ON users (lower(name));
But then I'll have two indexes, and that seems wasteful and unnecessary to me. So, does PRIMARY KEY mean anything special to postgres beyond "UNIQUE NOT NULL," and am I missing anything by not having one?
Upvotes: 3
Views: 514
Reputation: 4061
Short answer: No, you don't need a declarative "PRIMARY KEY", since the UNIQUE index serves the same exact purpose.
Long answer:
The idea of having Primary Keys comes from database systems where the data is physically in key order. This requires having a single, "primary" key. MySQL InnoDB is this way, as are many older databases.
However, PostgreSQL does not keep the tables in key order; it separates the indexes, including the primary key index, from the heap, which is essentially unordered. As a result, in Postgres, there is no material difference between primary keys and unique indexes. You can even create a foreign key against a unique index, as long as that index covers the whole table.
That being said, some tools external to PostgreSQL look for primary keys and do not regard unique indexes as being equivalent. These tools may cause you issues because of not finding a PK.
Upvotes: 0
Reputation: 656982
First off, practically every table should have a primary key.
citext
The additional module provides a data type of the same name. "ci" for case insensitive. Per documentation:
The
citext
module provides a case-insensitive character string type,citext
. Essentially, it internally callslower
when comparing values. Otherwise, it behaves almost exactly liketext
.
It is intended for exactly the purpose you describe:
The
citext
data type allows you to eliminate calls to lower in SQL queries, and allows a primary key to be case-insensitive.
Bold emphasis mine.
Be sure to read the manual about limitations first. Install it once per database with
CREATE EXTENSION citext;
text
If you don't want to go that route, I suggest you add a serial
as surrogate primary key.
CREATE TABLE users (
user_id serial PRIMARY KEY
, username text NOT NULL
);
I would use text
instead of varchar(24)
. Use a CHECK
constraint if you need to enforce a maximum length (that may change at a later time). Details:
Along with the UNIQUE
index in your original design (without type cast):
CREATE UNIQUE INDEX users_username_lower_idx ON users (lower(username));
The underlying integer
of a serial
is small and fast and does not have to waste time with lower()
or the collation of your database. That's particularly useful for foreign key references. I mostly prefer that over some natural primary key with varying properties.
Both solutions have pros and cons.
Upvotes: 5
Reputation: 592
I would suggest using a primary key, as you have stated you want something that is unique, and as you have demonstrated that you can put unique constraints on a username. I will assume that since this is a unique,not null username that you will use this to track your users in other parts of the Database, as well as allow usernames to be changed. This is where a primary key will come in handy, instead of having to go into all of your tables and change the value of the Username column, you will only have one place to change it. Example
Without primary key:
Table users
Username
'Test'
Table thingsdonebyUsers
RandomColumn AnotherColumn Username
RandomValue RandomValue Test
Now assume your user wants to change his username to Test1, well now you have to go find everywhere you used Username and change that to the new value before you change it in your users table since I'm assuming you will have a constraint there.
With Primary Key
Table users
PK Username
1 'Test'
Table thingsdonebyUsers
RandomColumn AnotherColumn PK_Users
RandomValue RandomValue 1
Now you can just change your users table and be done with the change. You can still enforce unique and not null on your username column as you demonstrated. This is just one of the many advantages of having normalized tables, which requires your tables to have a Primary Key that is an unrelated value(forget what the proper name is for this right now).
As for what a PK actually signifies, it just a non nullable unique column that identifies the row, so in this sense you already have a Primary Key on your table. The thing is that usually PKs are INT numbers because of the reason that I explained above.
Upvotes: 3