GreatKing
GreatKing

Reputation: 63

Access INSERT statement using INNER JOIN with NOT EXISTS gives wrong result

I have an Access .mdb database. There are two tables with identical structures: Prices and tmpPrices. Each table has three columns: [As of Date std] (Date/Time), Price (double), CUSIP (text, 255 chars). tmpPrices contains new records to be added to Prices. I have two SQL queries to update Prices from tmpPrices, one uses a WHERE join and the other uses an INNER JOIN.

The following version A works correctly:

INSERT INTO [Prices] SELECT * FROM [tmpPrices] WHERE NOT EXISTS 
(SELECT * from [Prices]
  WHERE ([Prices].[As of date std] = [tmpPrices].[As of date std])
  AND ([Prices].CUSIP = [tmpPrices].CUSIP));

whereas this version B does not work:

INSERT INTO [Prices] SELECT * FROM [tmpPrices] WHERE NOT EXISTS
(SELECT  [Prices].* FROM [Prices] INNER JOIN [tmpPrices] ON 
   ([Prices].[As of Date std] = [tmpPrices].[As of Date std]) 
   AND ([Prices].CUSIP = [tmpPrices].CUSIP));

Both sub-queries give the same result: a subset of records from tmpPrices that already exist in Prices.

It would not matter too much using version A were it not that I have to update other tables and one other table takes around 45 minutes to update using version A but only a fraction of that using version B and it appears to work correctly. So I would like to understand what is going on here.

Upvotes: 4

Views: 12448

Answers (3)

HansUp
HansUp

Reputation: 97101

A LEFT JOIN is definitely the way to go, as @devsh suggested, but you don't need to use a subquery. I built this in Access' query designer, and it works in Access 2003 with the tables you described. I did rename [As of date std] to eliminate spaces.

INSERT INTO Prices ( As_of_Date_std, Price, CUSIP)
SELECT t.As_of_Date_std, t.Price, t.CUSIP
FROM
    tmpPrices AS t
    LEFT JOIN Prices AS p
    ON (t.As_of_Date_std = p.As_of_Date_std) AND (t.CUSIP = p.CUSIP)
WHERE (((p.As_of_Date_std) Is Null));

Upvotes: 4

invertedSpear
invertedSpear

Reputation: 11054

A is taking so long because it is basically doing a full table scan of tmpPrices for every record in prices since there is no actual join being defined in the sub query.

Lets see if we can get you a fast query that gives you what you need.

tmpPrices will need an autonumber field for this to work and I suggest making it the pk for tmpPrices. Also a unique key based on the 3 other fields to prevent duplicates in tmpPrices (pretty sure you can do that in Access).

INSERT INTO [Prices] ([As of Date std],[Price],[CUSIP])
SELECT [As of Date std],[Price],[CUSIP] from tmpPrices where autoNumberID not in(
  SELECT autonumberID
  FROM tmpPrices 
  innerJoin prices on [Prices].[As of Date std] = [tmpPrices].[As of Date std] 
   AND [Prices].CUSIP = [tmpPrices].CUSIP
) query1

It will also help to keep things fast if you purge any records in tmpPrices each time you update prices with this.

Upvotes: 0

devsh
devsh

Reputation: 21

The second query does not work because there is no link between the select part of the query (from which you are trying to insert) and the contained sub-query. If you do want to use that format for insertion, you can do this:

INSERT INTO [Prices]
SELECT [As of Date], Price, CUSIP
FROM 
(SELECT tmpPrices.[As of Date std], tmpPrices.[Price], tmpPrices.[CUSIP]
FROM tmpPrices LEFT JOIN Prices ON
([Prices].[As of date std] = [tmpPrices].[As of date std]) AND ([Prices].CUSIP = [tmpPrices].CUSIP)) WHERE Prices.[As of date std] is null)

Upvotes: 1

Related Questions