Mythikos
Mythikos

Reputation: 131

Why is this access query taking so long?

I wrote an access client to do comparisons against two excel files. It loads the two excel files that are being compared into temporary tables and evaluates them based on the two queries show below.

There are two queries because sometimes one of the excel files will only have one name column. Basically the user inputs the name of the columns being compared and we change the query based on that.

The first query, cQueryFull, works perfectly and very fast (over 100k records in just a few seconds). The second query, cQueryPart, works as intended (in terms of comparison) but has never completed on tables with more than 5,000ish records. It ends up hanging for hours and I am forced to close the program.

I don't understand why one query is so much faster than the other and I was hoping someone might be able to help me figure it out and possibly fix the second query. The part of my access client that is creating the query is below:

If chkOneColumn.Value = 0 Then
    ' Construct Comparison Query
    qString = "SELECT OriginalFile." & txtOriginalFirst.Value & " as OriginalFirstName, OriginalFile." & txtOriginalMiddle.Value & " as OriginalMiddleName, OriginalFile." & txtOriginalLast.Value & " as OriginalLastName, WorkingFile." & txtWorkingFirst.Value & " as WorkingFirstName, WorkingFile." & txtWorkingMiddle.Value & " as WorkingMiddleName, WorkingFile." & txtWorkingLast.Value & " as WorkingLastName " _
            + "FROM OriginalFile, WorkingFile " _
            + "WHERE (OriginalFile." & txtOriginalFirst.Value & " not like WorkingFile." & txtWorkingFirst.Value & " or OriginalFile." & txtOriginalMiddle.Value & " not like WorkingFile." & txtWorkingMiddle.Value & " or OriginalFile." & txtOriginalLast.Value & " not like WorkingFile." & txtWorkingLast.Value & ") " _
            + "and OriginalFile." & txtOriginalAddress.Value & " = WorkingFile." & txtWorkingAddress.Value & " " _
            + "and OriginalFile." & txtOriginalDOB.Value & " = WorkingFile." & txtWorkingDOB.Value & " "

    ' Open the record set
    Set db = CurrentDb
    Set qd = db.CreateQueryDef("cQueryFull")
    With qd
        .ReturnsRecords = True
        .sql = qString
    End With
    DoCmd.OpenQuery "cQueryFull"
ElseIf chkOneColumn.Value = -1 Then
    ' Construct Comparison Query
    qString = "SELECT OriginalFile." & txtOriginalFirst.Value & " as OriginalName, IIF(WorkingFile." & txtWorkingFirst.Value & " is null, '', WorkingFile." & txtWorkingFirst.Value & ") + IIF(WorkingFile." & txtWorkingMiddle.Value & " is null, '', ' '+WorkingFile." & txtWorkingMiddle.Value & ") + IIF(WorkingFile." & txtWorkingLast.Value & " is null, '', ' '+WorkingFile." & txtWorkingLast.Value & ") as WorkingName " _
            + "FROM OriginalFile, WorkingFile " _
            + "WHERE (OriginalFile." & txtOriginalFirst.Value & " not like '*'+WorkingFile." & txtWorkingFirst.Value & "+'*' or OriginalFile." & txtOriginalFirst.Value & " not like '*'+WorkingFile." & txtWorkingMiddle.Value & "+'*' or OriginalFile." & txtOriginalFirst.Value & " not like '*'+WorkingFile." & txtWorkingMiddle.Value & "+'*') " _
            + "and OriginalFile." & txtOriginalAddress.Value & " like WorkingFile." & txtWorkingAddress.Value + " " _
            + "and OriginalFile." & txtOriginalDOB.Value & " like WorkingFile." & txtWorkingDOB.Value & " " _

    ' Open the record set
    Set db = CurrentDb
    Set qd = db.CreateQueryDef("cQueryPart")
    With qd
        .ReturnsRecords = True
        .sql = qString
    End With
    DoCmd.OpenQuery "cQueryPart"
End If

Can anyone identify the problem with my query? In case it matters, I have already tried indexing the tables before the query is built and executed. Any help would be greatly appreciated!

Upvotes: 1

Views: 1002

Answers (1)

jleach
jleach

Reputation: 7790

It's hard to tell, but I suspect the problem is with the cross joins and the amount of predicates (and type of predicates) in the WHERE clause(s).

Joining two tables like you're doing tends to create a very large set that the WHERE clause will then have to run through. Furthermore, the LIKE operator in a JET/ACE query is probably the slowest comparison operator that there is. Especially LIKE with a leading wildcard (*).

Sometimes there's just no getting around it, but sometimes it's actually faster to load pre-queried portions into (yet another) temp table and run further queries against that data.

Is there any way you can simplify your WHERE clause, or identify the predicates in distinct batches in such a way that you can run a more straighforward query first, then further process those results? (I suggest possibly writing to temp tables and further querying because subqueries are optimized and don't necessarily guarantee that the "sql logic" you write it as will be how it's actually run).

Upvotes: 1

Related Questions