Create an identity column that depends on another column

Question

Let's say that I have a table with the following columns

dbo.Orders (
    customerNumber INT,
    orderNumber INT,
    cost FLOAT, -- its just an example, there are other columns
    ...
)

I want the order number to autoincrement but only for the given customer. So for example, if I was to execute the following

INSERT INTO
    dbo.Orders (customerNumber, cost)
VALUES
    (1, 2500), -- would take orderNumber: 1 
    (1, 3500), -- would take orderNumber: 2
    (1, 1500), -- would take orderNumber: 3
    (2, 650), -- would take orderNumber: 1, because the customer is 2
    (2, 50), -- would take orderNumber: 2
    (1, 100), -- would take orderNumber: 4, because customers 1 last order was 3

And so on, such that there will be always an unique combination of customerNumber and orderNumber

I know I can enforce the uniqueness of the pair of values using an UNIQUE KEY with both columns, but I could not find anything about an IDENTITY using two columns. I found you can get something similar to an IDENTITY creating a sequence, but I could not find something that would allow me to have many sequences one per customer.

The only thing I can think off is to read the current value, then add one and then insert it, like so

SELECT @lastOrderNumber = orderNumber 
FROM dbo.Orders
WHERE customerNumber = @Customer

SET @newOrderNumber = ISNULL(@lastOrderNumber,0) + 1;

INSERT INTO dbo.Orders (customerNumber, cost, orderNumber)
VALUES (@Customer, 2500, @newOrderNumber)

But I think it becomes kind of cumbersome, its more difficult to do multiple inserts at once, and also becomes succeptible to racing conditions such as two insertions on the same customer nearly at the same time, something identities are very good at handling. I don't know if there is a better way

Thank you so much in advance

Charlieface · Accepted Answer

If you want to keep this simple and just use basic locking, it's not that hard. You don't need intermediate variables, but you do need correct locking hints. The locking hints used below, along with the index, will be all you need to ensure correctness.

INSERT INTO
    dbo.Orders (customerNumber, cost, orderNumber)
SELECT v.*,
  ISNULL(o.orderNumber, 0) + ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM (VALUES
    (1, 2500),
    (1, 3500),
    (1, 1500),
    (2, 650),
    (2, 50),
    (1, 100)
) AS v(customerNumber, cost)
OUTER APPLY (
    SELECT TOP (1)
        o.orderNumber
    FROM dbo.Orders o WITH (SERIALIZABLE, UPDLOCK)  -- the UPDLOCK is absolutely essential!!
    WHERE o.customerNumber = v.customerNumber
    ORDER BY
        o.orderNumber DESC
) AS o;

You must put an index on the table (customerNumber ASC, orderNumber DESC) otherwise you'll get huge locking problems and slow query plans.

If you are set on using sequences, then given the relatively low numbers of possible customerNumbers, you could create a sequence per value. You can automate this using triggers.

I'd advise shoving them into another schema for easy management sake.

CREATE OR ALTER TRIGGER tr_CreateCustomerSequence ON dbo.Customer
AFTER INSERT, DELETE
AS

SET NOCOUNT ON;

DECLARE @sql nvarchar(max);

SELECT @sql = STRING_AGG(CONCAT(N'
CREATE SEQUENCE Sequences.Customer_', i.customerNumber, ' AS int MINVALUE 1 NOCACHE'), '')
FROM inserted i
WHERE NOT EXISTS (SELECT 1
    FROM sys.sequences seq
    WHERE seq.name = CONCAT(N'Customer_', i.customerNumber)
);

EXEC @sql;


SELECT @sql = STRING_AGG(CONCAT(N'
DROP SEQUENCE Sequences.Customer_', i.customerNumber), '')
FROM deleted;

EXEC @sql;

Then you can use them pretty simply, using sp_sequence_get_range to get the values dynamically. The only complication is that you need to split the insert batch per customer, not once for all customer orders.

DECLARE @customerNumber int = 1;   -- or a parameter
DECLARE @numberOfInsertRows bigint = 3;

DECLARE @sequence_name nvarchar(776) = CONCAT(N'Sequences.Customer_', @customerNumber);
DECLARE @range_first_value int;

EXEC sp_sequence_get_range
  @sequence_name = @sequence_name,
  @range_size = @range_size,
  @range_first_value = @range_first_value OUT;


INSERT INTO
    dbo.Orders (customerNumber, cost, orderNumber)
SELECT v.*,
  @range_first_value + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
FROM (VALUES
    (1, 2500),
    (1, 3500),
    (1, 1500)
) AS v(customerNumber, cost);

You can alternatively call sp_sequence_get_range from client-side app code, and then pass the correct values in.

Create an identity column that depends on another column

Answers (2)

Related Questions