stitch70
stitch70

Reputation: 143

SELECT INTO query with NOT IN subquery takes a long time / hangs

The issue is that this query hangs or has infinite records which I am not aware on how to fix using MS ACCESS:

Expected User Input:

User input Start Date: 1/15/2015
User input End Date: 11/15/2015
User input Upper Data Threshold in kB: 50

Source Table:

[Master] Table in Access:
Invc Date  Mobile Nbr     PktDtVol   
---------  ----------     --------   
1/15/15   647-409-8206    48kB
2/15/15   647-409-8206    33kB
3/15/15   647-409-8206    8000kB
4/15/15   647-409-8206    20kB
5/15/15   647-409-8206    10kB
6/15/15   647-409-8206    0kB
7/15/15   718-500-2311    3kB
8/15/15   718-500-2311    45kB
9/15/15   718-500-2311    25kB
10/15/15  514-300-3311    33kB
11/15/15  514-300-3311    20kB

Output in [Temp_Table]:

Invc Date  Mobile Nbr     PktDtVol    Difference in Days 
---------  ----------     --------   -------------------
7/15/15    718-500-2311    3kB             304
8/15/15    718-500-2311    45kB            304
9/15/15    718-500-2311    25kB            304
10/15/15   514-300-3311    33kB            304
11/15/15   514-300-3311    20kB            304

Accepted SQL Solution to generate the above output:

PARAMETERS [Start Date] DateTime, [End Date] DateTime, [Upper Bound Usage in KB] IEEEDouble; 
SELECT m.[Invc Date], m.PktDtVol, m.[Mobile Nbr], DateDiff("d",[Start Date],[End Date]) AS [Difference in days] 
INTO Temp_Table FROM Master AS m 
WHERE (m.[Invc Date]>=[Start Date] And m.[Invc Date])<=[End Date] AND m.[Mobile Nbr] NOT IN 
(SELECT q.[Mobile Nbr] FROM Master AS q WHERE (q.PktDtVol>=[Upper Bound Usage in KB]));

From here, I've attempted to create an index to optimize the query by modifying the table through another SQL statement, but does not work:

 CREATE INDEX Index2 ON Master([Ttl Charges])

The query works fine with no hanging with 10 records in the source table with multiple records with the expected output. But the issue arises when there is 56,000 records in the source table with multiple records.

Upvotes: 3

Views: 173

Answers (1)

Gord Thompson
Gord Thompson

Reputation: 123809

An index on [Ttl Charges] isn't going to help you any, but an index on [PktDtVol] will. I just did a test with 10,000 rows and the lack of an index on [PktDtVol] was definitely a performance bottleneck:

Indexes:
  none
Time:
  170 seconds (just under 3 minutes)

Indexes:
  [Invc Date]
  [Mobile Number]
Time:
  (same as before)

Indexes:
  [Invc Date]
  [Mobile Number]
  [PktDtVol]
Time:
  36 seconds

For an additional performance boost, you can reformulate the query to use a LEFT JOIN instead of a NOT IN clause with a subquery, as you mentioned in your comment:

PARAMETERS [Start Date] DateTime, [End Date] DateTime, [Upper Bound Usage in KB] IEEEDouble; 
SELECT 
    [Master].[Invc Date], 
    [Master].PktDtVol, 
    [Master].[Mobile Nbr], 
    DateDiff("d",[Start Date],[End Date]) AS [Difference in days] 
INTO Temp_Table 
FROM 
    [Master] 
    LEFT OUTER JOIN 
    (
        SELECT DISTINCT q.[Mobile Nbr] FROM Master AS q 
        WHERE (q.PktDtVol>=[Upper Bound Usage in KB])
    ) s 
        ON [Master].[Mobile Nbr] = s.[Mobile Nbr] 
WHERE 
    [Master].[Invc Date] >= [Start Date]
        AND [Master].[Invc Date] <= [End Date]
        AND s.[Mobile Nbr] IS NULL;

Upvotes: 3

Related Questions