keynesiancross
keynesiancross

Reputation: 3529

VBA Excel - Performance issue of large Scripting.Dictionary

I built a function that queries a in-memory dictionary to avoid using another function that queried an actual database on a server. The goal was to query the database only once (as the queries were kind of slow) and keep the result in memory for recall. I have about 25k cells in the spreadsheet with the original Database query function, that I'm hoping to replace with this function below.

This has resulted in a very large public global dictionary (about 25k string keys, each key is about 75 characters long), but I'm finding that when I run this function the performance is worse than just querying the actual database on the server. I was hoping someone could suggest some ways to make this perform better...

Public dataDict As New Scripting.Dictionary

Public Function DB_Wrap(database As String, query As String) As Double
    Dim q As ServerDatabase
    Dim key As String
    Dim result As Double

    key = database & "|||" & query

    If dataDict.Exists(key) = False Then
        Set root = New ServerDatabase
        Set q = root.Open(database)

        result = q.total(query)
        dataDict.Add key, result
    Else
        result = dataDict(key)   'its when I run this that the performance isn't great. 
    End If

    DB_Wrap= result
End Function

Upvotes: 0

Views: 2894

Answers (2)

bodgesoc
bodgesoc

Reputation: 321

This isn't really an answer, but became too long to be a comment.

I have a macro that used to work acceptably, but since we switched to O365 seems very slow. I am aware that correlation does not always imply causation. Adding items to the dictionary and testing for duplicates is OK, about 30 seconds to add the 88k keys, but then writing all the unique pairs to a new sheet is taking an awfully long time.

i = 0
T = Timer()
For Each K In L
    DoEvents
    Set D = D.Offset(1, 0)
    i = i + 1: If i Mod 1000 = 0 Then Debug.Print Timer() - T
Next

Gives 0.20, 0.41, 0.61 ... and completes in 18 seconds.

If we actually access the value of K:

For Each K In L
    DoEvents
    D.Value = K
    Set D = D.Offset(1, 0)
    i = i + 1: If i Mod 1000 = 0 Then Debug.Print Timer() - T
Next

Then the progress becomes steadily slower and I am too impatient to allow it to finish: 104, 376, 810, 1424 seconds so heading towards 15 hours to completion and suggesting 0.6 seconds to access each key.

And if we search for the value using the key:

For Each K In L
    DoEvents
    D.Value = K
    D.Offset(0, 1).Value = L(K)
    Set D = D.Offset(1, 0)
    i = i + 1: If i Mod 1000 = 0 Then Debug.Print Timer() - T
Next

And the timestamps every 1000 entries become 121, 417, 909, 1561 suggesting that, surprisingly, it isn't finding the value for the key that takes the time, but accessing the key itself in this case, even by iteration.

I also tried accessing the key and value by index, but this was no better:

For i = 0 To L.Count
    DoEvents
    D.Value = L.Keys(i)
    D.Offset(0, 1).Value = L.Items(i)
    Set D = D.Offset(1, 0)
    If i Mod 1000 = 0 Then Debug.Print Timer() - T
Next

Basically I am going to have to abandon the idea of using a dictionary to detect and remove duplicate entries (which is the only reason that I was using it)

Upvotes: 3

Absinthe
Absinthe

Reputation: 3391

A few things to try:

You're creating a new string with each call via key = database & "|||" & query and you have the overhead of passing 'database' and 'query' each time (albeit they are passed by reference, but still, it adds up) and you're allocating memory for strings each time. Is there any way you can have these stored in one instance, rather than create them every time you call the function? Perhaps you could create all the required strings at the same time and pass them, rather than create them each time in the function?

I'd imagine Set q = root.Open(database) is the slow part here so if you can get all the keys in one instance it should help.

Can you make the keys shorter? The more chars that are processed the slower it will be.

Rather than declare dataDict As New Scripting.Dictionary declare it in your active code:

Dim dataDict As Scripting.Dictionary

Set dataDict = New Scripting.Dictionary

If you declare with the New keyword your code will often go to check if the object exists before carrying out your orders.

Lastly, how does the data get from the server into a dictionary? Maybe an array would be better?

Upvotes: 1

Related Questions