Abdón Araya
Abdón Araya

Reputation: 186

Making an INSERT INTO as error/duplicate proof as possible without ID

I have a stored procedure that inserts from tableA to tableB. Each time the procedure is executed (because tableA will get updated constantly so when the procedure is run it will add missing data into B), it will insert the entire contents of A into B no matter what.

So in the insert I added a WHERE statement to filter "older data" based on the timestamp of when the record was added (insert new records where the timestamp from tableA is greater than the one on tableB is). The table has no unique identifiers so I'm kinda stuck making the filter as safe as it can, basically the only thing I could consider is the timestamp. I've tried some ways around it but all of them have their cons (commented in each code block).

INSERT INTO TABLEB (VALUES)
    SELECT A.VALUES
    FROM A 
    WHERE (SELECT MAX(TIMESTAMP) FROM TABLEB) < (SELECT MAX(TIMESTAMP) FROM TABLEA)

--THIS DOESN'T EVEN WORK
INSERT INTO TABLEB (VALUES)
    SELECT A.VALUES
    FROM A 
    WHERE A.TIMESTAMP BETWEEN (SELECT MAX(TIMESTAMP) FROM TABLEB) AND (SELECT MAX(TIMESTAMP) FROM TABLEA)

--IF RUN TWICE IT WILL RUN ON RECORDS CONTAINING TIMESTAMPS WITH THOSE VALUES
INSERT INTO TABLEB (VALUES)
    SELECT A.VALUES
    FROM A 
    WHERE A.TIMESTAMP BETWEEN (SELECT MAX(TIMESTAMP) FROM TABLEB) AND (SELECT MAX(TIMESTAMP) FROM TABLEA)
      AND A.TIMESTAMP <> (SELECT MAX(TIMESTAMP) FROM TABLEA)

--THIS WILL EFFECTIVELY SKIP RECORDS WITH THE TIMESTAMP FROM TABLEA BUT RUN TWICE WILL DUPLICATES THE ONES WITH TIMESTAMP FROM B

INSERT INTO TABLEB (VALUES)
    SELECT A.VALUES
    FROM A 
    WHERE A.TIMESTAMP BETWEEN (SELECT MAX(TIMESTAMP) FROM TABLEB) AND (SELECT MAX(TIMESTAMP) FROM TABLEA)
      AND A.TIMESTAMP <> (SELECT MAX(TIMESTAMP) FROM TABLEA)
      AND A.TIMESTAMP <> (SELECT MAX(TIMESTAMP) FROM TABLEB)

--THIS WILL SKIP DATA WITH THE TIMESTAMP FROM B

Any help on the logic to make in run ONCE and skip duplicated rows is appreciated.

Upvotes: 1

Views: 84

Answers (3)

Amir Molaei
Amir Molaei

Reputation: 3820

You need to have a version column in table A with the type of timestamp and a corresponding column in table B with the type of varbinary(8). With this assumption, the following code must work:

INSERT INTO TABLEB (VALUES)
    SELECT A.VALUES
    FROM A 
    WHERE A.Version > (SELECT MAX(Version) FROM TABLEB)

Upvotes: 0

Muqadar Ali
Muqadar Ali

Reputation: 87

Inserting all the records on each insert of table A is pretty much costly. You can use "Scope_Identity" command if you are using T-SQL (sql server). You need to add each record in both tables in same call. Add values to table A then add to table B.

Here is an example:

INSERT INTO TABLEA (column-names) 
VALUES (values)

DECLARE @LAST_RECORD_ID INT=0

SELECT @LAST_RECORD_ID = SCOPE_IDENTITY()

INSERT INTO TABLEB (column-names) 

SELECT 
    (column-names)
FROM TABLEA WHERE TABLEA.ID = @LAST_RECORD_ID

Upvotes: 0

Juan Carlos Oropeza
Juan Carlos Oropeza

Reputation: 48197

I'm not sure what is your problem, but this should work. You check what was the last one inserted in Table B and then got all the new records.

INSERT INTO TABLEB (VALUES)
    SELECT A.VALUES
    FROM A 
    WHERE TIMESTAMP > (SELECT MAX(TIMESTAMP) FROM TABLEB)

Upvotes: 1

Related Questions