logjammin
logjammin

Reputation: 1211

In PostgreSQL, add a column with unique values according to specific column ordering / sorting

Background

I've got a PostgreSQL table, db1, that's got several columns in it but no primary key. I need to add one. Postgres is version 13-point-something, running on a local server on a 64bit Windows PC. Here's a toy version of the table:

+-------------------+------------+
|member             |service_date|
+-------------------+------------+
|eof81j4            |2011-01-06  |
|eof81j4            |2010-06-03  |
|eof81j4            |2010-01-12  |
|eof81j4            |2011-05-21  |
|j42roit            |2015-11-29  |
|j42roit            |2015-11-29  |
|j42roit            |2015-11-29  |
|p8ur0fq            |2014-01-13  |
|p8ur0fq            |2016-04-04  |
|p8ur0fq            |2014-01-13  |
|vplhbun            |2019-08-15  |
|vplhbun            |2019-08-15  |
|vplhbun            |2019-08-15  |
|akj3vie            |2009-03-31  |
+-------------------+------------+

Note here that the table isn't ordered in any obvious way: the member column isn't in ABC order, say, and the service_date column isn't ordered chronologically or anything.

The Problem

Typically, I'd add a primary key to a table like so:

ALTER TABLE db1 ADD COLUMN id SERIAL PRIMARY KEY

And that works just fine. However, because of some queries I have to run later on, I'd like the primary key to be applied as if the table were ordered (sorted, in other words) first by member (ascending alphabetical order) and, within that, chronologically (from oldest to newest) by service date. In other words I'd like something like this:

+-------------------+------------+--+
|member             |service_date|id|
+-------------------+------------+--+
|akj3vie            |2009-03-31  |1 |
|eof81j4            |2010-01-12  |2 |
|eof81j4            |2010-06-03  |3 |
|eof81j4            |2011-01-06  |4 |
|eof81j4            |2011-05-21  |5 |
|j42roit            |2015-11-29  |6 |
|j42roit            |2015-11-29  |7 |
|j42roit            |2015-11-29  |8 |
|p8ur0fq            |2014-01-13  |9 |
|p8ur0fq            |2014-01-13  |10|
|p8ur0fq            |2016-04-04  |11|
|vplhbun            |2019-08-15  |12|
|vplhbun            |2019-08-15  |13|
|vplhbun            |2019-08-15  |14|
+-------------------+------------+--+

What I've Tried

Attempts at cracking this have gone something like this:

ALTER TABLE db1 ADD COLUMN id SERIAL PRIMARY KEY ORDER BY member, service_date 

But that yields [42601] ERROR: syntax error at or near "ORDER". I've looked for similar posts on SO, but solutions seem to involve creating a new table, and I'd prefer not to do that (as the real table I'm going to be doing this on is very very large).

EDIT

Upon reflection, and thanks in part to Gordon Linoff's answer below, I'm realizing that I don't actually need this new column to be a Primary Key in the formal sense that Postgres recognizes it as such. All I really need is a column with unique values for each row that are ordered in the way I described. Whether it's formally a PK or not doesn't actually matter. (In the original title and body of this post, I had asked for a Primary Key. I've edited things to reflect that that's not strictly necessary.)

Upvotes: 1

Views: 1615

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269973

One approach is to add the id column, set the value and then convert it to an identity. You have to be careful, because you want the numbering to start at 15 (for your sample data):

alter table db1 add column id int not null default 0;
update db1
    set id = d.new_id
    from (select db1.*, db1.ctid, row_number() over (order by member, service_date) as new_id
          from db1
         ) d
    where db1.ctid = d.ctid;

alter table db1 alter column id drop default;

alter table db1 alter column id add generated always as identity  (start with 15) ;

alter table db1 add primary key (id);

Here is a db<>fiddle.

Upvotes: 2

Related Questions