sql postgresql database-design concurrency sequence

Reputation: 2764

Is it possible to use a PG sequence on a per record label?

Does PostgreSQL 9.2+ provide any functionality to make it possible to generate a sequence that is namespaced to a particular value? For example:

 .. | user_id | seq_id | body | ...
 ----------------------------------
  - |    4    |   1    |  "abc...."
  - |    4    |   2    |  "def...."
  - |    5    |   1    |  "ghi...."
  - |    5    |   2    |  "xyz...."
  - |    5    |   3    |  "123...."

This would be useful to generate custom urls for the user:

domain.me/username_4/posts/1    
domain.me/username_4/posts/2

domain.me/username_5/posts/1
domain.me/username_5/posts/2
domain.me/username_5/posts/3

I did not find anything in the PG docs (regarding sequences and sequence functions) to do this. Are subqueries in the INSERT statement or custom PG functions the only other options?

Upvotes: 6

Answers (4)

Erwin Brandstetter

Reputation: 658092

You can use a subquery in the INSERT statement like @Clodoaldo demonstrates. However, this defeats the nature of a sequence as being safe to use in concurrent transactions: it will result in race conditions and eventually duplicate key violations.
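A minimal sketch of that race, in plain Python rather than against a real database: two simulated transactions both read the current maximum before either one writes, so both compute the same next value. The names `rows`, `next_seq_id` and `insert` are hypothetical, and the set only stands in for a unique key on (user_id, seq_id).

```python
# Simulates the max()+1 race with an in-memory set standing in for a table
# that has a unique key on (user_id, seq_id). This is not PostgreSQL itself,
# only the interleaving that leads to the collision.

rows = {(4, 1), (4, 2)}  # existing posts of user 4


def next_seq_id(user_id):
    """Stand-in for: select coalesce(max(seq_id), 0) + 1 from t where user_id = ..."""
    return max((s for u, s in rows if u == user_id), default=0) + 1


def insert(user_id, seq_id):
    """Insert a row; raise if the (user_id, seq_id) key already exists."""
    key = (user_id, seq_id)
    if key in rows:
        raise KeyError(f"duplicate key {key}")
    rows.add(key)


# Both "transactions" read before either one writes.
seq_a = next_seq_id(4)
seq_b = next_seq_id(4)

insert(4, seq_a)      # first writer succeeds
try:
    insert(4, seq_b)  # second writer collides on the same key
    collided = False
except KeyError:
    collided = True
```

Without the unique key, the second insert in this sketch would not fail; it would simply store the same number twice.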

You should rather rethink your approach: use just one plain sequence for the table and combine it with user_id to get the sort order you want.

You can always generate the custom URLs with the desired numbers using row_number() with a simple query like:

SELECT format('domain.me/username_%s/posts/%s'
            , user_id
            , row_number() OVER (PARTITION BY user_id ORDER BY seq_id)
             )
FROM   tbl;
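To make clear what the query computes, here is a rough Python equivalent of the window function, assuming the rows are available as (user_id, seq_id) tuples. The sample data and the function name `post_urls` are made up for illustration; seq_id comes from one table-wide sequence, so it has gaps per user.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows: (user_id, seq_id), numbered by a single shared sequence.
tbl = [(4, 1), (5, 2), (4, 3), (5, 4), (5, 7)]


def post_urls(rows):
    """Number each user's rows 1..n in seq_id order and build the URLs."""
    urls = []
    ordered = sorted(rows)  # sort by user_id, then by seq_id
    for user_id, group in groupby(ordered, key=itemgetter(0)):
        # enumerate() plays the role of row_number() within the partition
        for n, _row in enumerate(group, start=1):
            urls.append(f"domain.me/username_{user_id}/posts/{n}")
    return urls


urls = post_urls(tbl)
```

Because the per-user numbers are derived at query time, deleting a row shifts the numbers of that user's later rows.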


Upvotes: 2

Clodoaldo Neto

Reputation: 125444

insert into t (user_id, seq_id) values
(4, (select coalesce(max(seq_id), 0) + 1 from t where user_id = 4))

Check for a duplicate primary key error in the front end and retry if needed.

Update

Although @Erwin's advice is sensible, that is, a single sequence with the ordering done in the select query, it can be expensive.

If you don't use a sequence, the nature of a sequence is not defeated. It will also not result in a duplicate key violation. To demonstrate, I created a table and wrote a Python script that inserts into it. I launched 3 parallel instances of the script, inserting as fast as possible, and it just works.

The table must have a primary key on those columns:

create table t (
    user_id int,
    seq_id int,
    primary key (user_id, seq_id)
);

The python script:

#!/usr/bin/env python

import psycopg2, psycopg2.extensions

query = """
    begin;
    insert into t (user_id, seq_id) values
    (4, (select coalesce(max(seq_id), 0) + 1 from t where user_id = 4));
    commit;
"""

conn = psycopg2.connect('dbname=cpn user=cpn')
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE)
cursor = conn.cursor()

for i in range(0, 1000):

    while True:
        try:
            cursor.execute(query)
            break
        except psycopg2.IntegrityError as e:
            print(e.pgerror)
            cursor.execute("rollback;")

cursor.close()
conn.close()

After the parallel run:

select count(*), max(seq_id) from t;
 count | max  
-------+------
  3000 | 3000

Just as expected. I developed at least two applications using that logic; one of them is more than 13 years old and has never failed. I concede that if you are Facebook or some other giant then you could have a problem.

Upvotes: 0

Chris Farmiloe

Reputation: 14185

Maybe this answer is a little off-piste, but I would consider partitioning the data and giving each user their own partitioned table for posts.

There's a bit of overhead in the setup, as you will need triggers to manage the DDL statements for the partitions, but it would effectively result in each user having their own table of posts, along with their own sequence, with the benefit of still being able to treat all posts as one big table.

General gist of the concept...

psql# CREATE TABLE posts (user_id integer, seq_id integer);
CREATE TABLE

psql# CREATE TABLE posts_001 (seq_id serial) INHERITS (posts);
CREATE TABLE

psql# CREATE TABLE posts_002 (seq_id serial) INHERITS (posts);
CREATE TABLE

psql# INSERT INTO posts_001 VALUES (1);
INSERT 0 1

psql# INSERT INTO posts_001 VALUES (1);
INSERT 0 1

psql# INSERT INTO posts_002 VALUES (2);
INSERT 0 1

psql# INSERT INTO posts_002 VALUES (2);
INSERT 0 1

psql# select * from posts;
 user_id | seq_id 
---------+--------
       1 |      1
       1 |      2
       2 |      1
       2 |      2
(4 rows)

I left out some rather important CHECK constraints in the above setup, so make sure you read the docs on how these kinds of setups are used.

Upvotes: 1

Federico Razzoli

Reputation: 5371

Yes:

CREATE TABLE your_table
(
    column type DEFAULT NEXTVAL('sequence_name'),
    ...
);

More details here: http://www.postgresql.org/docs/9.2/static/ddl-default.html

Upvotes: -2
