AngocA
AngocA

Reputation: 7693

Way to migrate a create table with sequence from postgres to DB2

I need to migrate a DDL from Postgres to DB2, but I need that it works the same as in Postgres. There is a table that generates values from a sequence, but the values can also be explicitly given.

Postgres

create sequence hist_id_seq;

create table benchmarksql.history (
  hist_id  integer not null default nextval('hist_id_seq') primary key,
  h_c_id   integer,
  h_c_d_id integer,
  h_c_w_id integer,
  h_d_id   integer,
  h_w_id   integer,
  h_date   timestamp,
  h_amount decimal(6,2),
  h_data   varchar(24)
);

(Look at the sequence call in the hist_id column to define the value of the primary key)

The business logic inserts into the table by explicitly providing an ID, and in other cases, it leaves the database to choose the number.

If I change this in DB2 to a GENERATED ALWAYS it will throw errors because there are some provided values. On the other side, if I create the table with GENERATED BY DEFAULT, DB2 will throw an error when trying to insert with the same value (SQL0803N), because the "internal sequence" does not take into account the already inserted values, and it does not retry with a next value.

And, I do not want to restart the sequence each time a provided ID was inserted.

This is the problem in BenchmarkSQL when trying to port it to DB2: https://sourceforge.net/projects/benchmarksql/ (File sqlTableCreates)

How can I implement the same database logic in DB2 as it does in Postgres (and apparently in Oracle)?

Upvotes: 0

Views: 900

Answers (1)

Clockwork-Muse
Clockwork-Muse

Reputation: 13056

You're operating under a misconception: that sources external to the db get to dictate its internal keys. Ideally/conceptually, autogenerated ids will never need to be seen outside of the db, as conceptually there should be unique natural keys for export or reporting. Still, there are times when applications will need to manage some ids, often when setting up related entities (eg, JPA seems to want to work this way).

However, if you add an id value that you generated from a different source, the db won't be able to manage it. How could it? It's not efficient - for one thing, attempting to do so would do one of the following

  1. Be unsafe in the face of multiple clients (attempt to add duplicate keys)
  2. Serialize access to the table (for a potentially slow query, too)

(This usually shows up when people attempt something like: SELECT MAX(id) + 1, which would require locking the entire table for thread safety, likely including statements that don't even touch that column. If you try to find any "first-unused" id - trying to fill gaps - this gets more complicated and problematic)

Neither is ideal, so it's best to not have the problem in the first place. This is usually done by having id columns be autogenerated, but (as pointed out earlier) there are situations where we may need to know what the id will be before we insert the row into the table. Fortunately, there's a standard SQL object for this, SEQUENCE. This provides a db-managed, thread-safe, fast way to get ids. It appears that in PostgreSQL you can use sequences in the DEFAULT clause for a column, but DB2 doesn't allow it. If you don't want to specify an id every time (it should be autogenerated some of the time), you'll need another way; this is the perfect time to use a BEFORE INSERT trigger;

CREATE TRIGGER Add_Generated_Id NO CASCADE BEFORE INSERT ON benchmarksql.history
               NEW AS Incoming_Entity
               FOR EACH ROW
               WHEN Incoming_Entity.id IS NULL
               SET id = NEXTVAL FOR hist_id_seq

(something like this - not tested. You didn't specify where in the project this would belong)

So, if you then add a row with something like:

INSERT INTO benchmarksql.history (hist_id, h_data) VALUES(null, 'a')

or

INSERT INTO benchmarksql.history (h_data) VALUES('a')

an id will be generated and attached automatically. Note that ALL ids added to the table must come from the given sequence (as @mustaccio pointed out, this appears to be true even in PostgreSQL), or any UNIQUE CONSTRAINT on the column will start throwing duplicate-key errors. So any time your application needs an id before inserting a row in the table, you'll need some form of

SELECT NEXT VALUE FOR hist_id_seq
FROM sysibm.sysdummy1

... and that's it, pretty much. This is completely thread and concurrency safe, will not maintain/require long-term locks, nor require serialized access to the table.

Upvotes: 1

Related Questions