Reputation: 522

INSERT INTO only if dataset doesn't already exist

I have this script in my program. It inserts new datasets for every entry in user_settings into my pricelist table. It works fine.

INSERT INTO [pricelist]
       ([ID]
       ,[plf]
       ,[vdf]
       ,[vdt]
       ,[plID]
       ,[usID])
 SELECT
       NEWID()
       ,5.5
       ,'2017-01-02 00:00:00'
       ,'2027-01-03 00:00:00'
       ,'8020F2FA1C80XXXXXXXXXXXXXXX'
       ,ID
FROM [user_settings]

But if I need to re-run the script, in case run 1 canceled for any reason. I get duplicate entries in table pricelist. How can I avoid this? Thanks in advance.

Upvotes: 0

Answers (3)

GarethD

Reputation: 69769

A simple way is to use NOT EXISTS:

INSERT INTO [pricelist]
       ([ID]
       ,[plf]
       ,[vdf]
       ,[vdt]
       ,[plID]
       ,[usID])
 SELECT
       NEWID()
       ,5.5
       ,'2017-01-02 00:00:00'
       ,'2027-01-03 00:00:00'
       ,'8020F2FA1C80XXXXXXXXXXXXXXX'
       ,ID
FROM [user_settings] AS us
WHERE NOT EXISTS (SELECT 1 FROM [pricelist] AS p WHERE p.UsID = us.ID);

Although, this is not thread safe, and you might still end up with duplicates if you meet a race condition. If something should be unique, then give it a unique constraint, e.g.

ALTER TABLE PriceList ADD CONSTRAINT UQ_PriceList__usID UNIQUE (usID);

This will guarantee no duplicates. The most threadsafe INSERT I know of is to use MERGE WITH (HOLDLOCK):

MERGE [PriceList] WITH (HOLDLOCK) AS p
USING [user_settings]  AS US
    ON us.ID = p.usID
WHEN NOT MATCHED THEN 
    INSERT ([ID], [plf], [vdf], [vdt], [plID], [usID])
    VALUES (NEWID(), 5.5, '2017-01-02 00:00:00', '2027-01-03 00:00:00', 
            '8020F2FA1C80XXXXXXXXXXXXXXX', us.[ID]);

This is still no substitute for creating your constraint though.

N.B. It is not quite clear from your question how you define a duplicate, for brevity I have assumed it is just usID in the table, but if it is more columns, all of the principles outlined above still apply

ADDENDUM

If the above is not working, then you may need to add more conditions to your join to include more columns:

MERGE PriceList WITH (HOLDLOCK) AS p
USING 
(   SELECT  ID = NEWID(), 
            plf = 5.5, 
            vdf = '2017-01-02 00:00:00', 
            vdt = '2027-01-03 00:00:00',
            plID = '8020F2FA1C80XXXXXXXXXXXXXXX',
            usID = us.ID
    FROM    user_settings  AS us
    GROUP BY us.ID -- only needed if us.ID is not unique
) AS us
    ON us.usID = p.usID
    AND us.plf = p.plf
    AND us.vdf = p.vdf
    AND us.vdt = p.vdt
    AND us.plID = p.plID
WHEN NOT MATCHED THEN 
    INSERT (ID, plf, vdf, vdt, plID, usID)
    VALUES (us.ID, us.plf, us.vdf, us.vdt, us.plID, us.usID);

Finally, part of the problem could be existing duplicates, which you can remove using something like:

DELETE  t
FROM    (   SELECT  *, RowNumber = ROW_NUMBER() OVER(PARTITION BY plf, vdf, vdt, plID, usID ORDER BY ID)
            FROM    PriceList
        ) AS t
WHERE   t.RowNumber > 1;

Upvotes: 0

Pரதீப்

Reputation: 93704

You should consider adding a unique constraint on your table to avoid such problems.

Use NOT Exists to avoid inserting duplicate data. Based on your comments, you need to check for all the columns except the ID

INSERT INTO [pricelist]
            ([ID],[plf],[vdf],[vdt],[plID],[usID])
SELECT [ID],[plf],[vdf],[vdt],[plID],[usID]
FROM   (SELECT Newid()                       ID,
               5.5                           AS plf,
               '2017-01-02 00:00:00'         AS vdf,
               '2027-01-03 00:00:00'         AS vdt,
               '8020F2FA1C80XXXXXXXXXXXXXXX' AS plID,
               ID                            AS usID
        FROM   [user_settings]) u
WHERE  NOT EXISTS (SELECT 1
                   FROM   pricelist p
                   WHERE  u.plf = p.plf
                          AND u.vdf = p.vdf
                          AND u.vdt = p.vdt
                          AND u.plID = p.plID
                          AND u.usID = p.usID)

Upvotes: 1

Juan Carlos Oropeza

Reputation: 48197

 INSERT INTO [pricelist]
        SELECT NEWID(), u.*
        FROM [user_settings] u
        LEFT JOIN [pricelist] p
          ON u.[id] = p.[id]
        WHERE p.[id] IS NULL

Upvotes: 0

INSERT INTO only if dataset doesn&#39;t already exist

Answers (3)

Related Questions

INSERT INTO only if dataset doesn't already exist