user2961053
user2961053

Reputation:

VBA ADO Recordset speed differs

I am hit by a very interesting query. I am using Excel to loop data in an access database (ADO). There are 2 tables and all fields are the same in both. The only difference is 1 has around 1200 records and the other 27,000 records. The iteration through the recordset does exactly the same thing for both tables however the iteration for the table with 1200 records loops around 23 records per second whereas when I run for the second (larger) table the iteration runs around 3 records per second. I know the recordset with more data fills the recordset with more records but as the iteration is doing the exact same thing i.e. checking some fields etc. then I wonder why 1 is running much slower than the other. Any thoughts? Thank you. btw, both the Excel file and Access database are on my local machine.

There is a large amount of code so I have just taken a relevant snippet below.

    qString = "SELECT DISTINCT [myField] FROM [myTable] WHERE [myDate] BETWEEN #" & FromDate & 
    FromTime & "# AND #" & ToDate & ToTime & "# ORDER BY [myField] Asc;"'
                                      
    Debug.Print qString
                    
    Set IceRecordset = .OpenRS_IceData1(qString, adUseClient, adOpenStatic, adLockReadOnly)
                   
    With IceRecordset      
                
    Do While .EOF = False 'Loope each entry in the RecordSet
    'Do all the stuff on the loop //

       .MoveNext
    Loop

    End With

Upvotes: 0

Views: 760

Answers (1)

Albert D. Kallal
Albert D. Kallal

Reputation: 49059

The reason for the vast difference is of course the distinct keyword. That has to go and check the other rows to ensure no duplicates. That is NOT linear, but a exponential growth.

So, with 10 rows you have a lot of compares, If you can live without and remove the distinct on that query, then speed difference (records per time unit) would not be all that much. But distinct is a VERY different and complex problem.

If you can get rid of the distinct, then that may well help and fix this.

Can you try a group by PK, or in fact can you live without the distinct, then performance should not be all that different. You could also (as you note) check for a index on the date column.

It is possible that your processing loop does more row processing and thus once again, the growth in processing here is not a linier problem, and thus more records will take progressive's longer. But, try the indexing, and try removal of distinct if you can. (and I would test with the processing loop skipped - just to ensure that your original pull of data is not slow. If the original pull of the larger dataset runs fast, then the the processing loop for each row is your problem.

Upvotes: 0

Related Questions