Reputation: 2842
say, i have a string array of 50000 elements. Searching the array using For Next
is so slow for such a huge array. Is there any fast way to search?
Note: Using join
& instr
we can search for a string in an array, but this method is no good as i can not find out the element number
Note: the array is unsorted. And i'm looking for substrings
Upvotes: 3
Views: 28446
Reputation: 1
An improved version of John's code (if you search a string, it finds the first occourence even though the string is not completely what you are searching, ex: you search "and", your array is "me and you","just","and" it returns 1 instead of 3)
Function IndexOf(ByRef arr() As String, ByVal str As String) As Integer
Dim tuttook As Boolean
Dim joinedStr As String
Dim strIndex As Integer
strIndex = 0
tuttook = False
joinedStr = "|" & Join(arr, "|")
While tuttook = False
strIndex = InStr(strIndex + 1, joinedStr, str)
If strIndex = 0 Then
IndexOf = -1
Exit Function
Else
If Mid(joinedStr, strIndex - 1, 1) = "|" And Mid(joinedStr, strIndex + Len(str), 1) = "|" Then tuttook = True
End If
Wend
joinedStr = Mid(joinedStr, 1, strIndex - 1)
IndexOf = UBound(Split(joinedStr, "|")) - 1
End Function
Upvotes: 0
Reputation: 1031
Well i used Joins
and Splits
, didn't do any benchmark
though:
Function IndexOf(ByRef arr() As String, ByVal str As String) As Integer
Dim joinedStr As String
Dim strIndex As Integer
joinedStr = "|" & Join(arr, "|")
strIndex = InStr(1, joinedStr, str)
If strIndex = 0 Then
IndexOf = -1
Exit Function
End If
joinedStr = Mid(joinedStr, 1, strIndex - 1)
IndexOf = UBound(Split(joinedStr, "|")) - 1
End Function
Upvotes: 0
Reputation: 297
here's a fast way to return the number of substring occurrences. Hope it helps!
Option Explicit
Option Compare Binary
Option Base 0
DefLng A-Z
Sub TestSubStringOccurence()
Dim GrabRangeArray() As Variant
Dim i As Long
Dim L As Long
Dim RunTime As Double
Dim SubStringCounter As Long
Dim J As Long
Dim InStrPosition As Long
Dim Ws As Excel.Worksheet
Set Ws = ThisWorkbook.Sheets("Sheet1")
RunTime = Timer
With Ws
For i = 1 To 50000
If i Mod 2 = 0 Then .Cells(i, 1).Value2 = "1 abcdef 2 abcdef 3 abcdef 4 abcdef 5 abcdef" _
Else .Cells(i, 1).Value2 = i Next i
GrabRangeArray = .Range("a1:a50000").Value
End With
RunTime = Timer
'returns number of substring occurrences
For i = 1 To UBound(GrabRangeArray, 1)
InStrPosition = 1
Do
InStrPosition = InStr(InStrPosition, GrabRangeArray(i, 1), "abcdef", vbBinaryCompare)
If InStrPosition <> 0 Then
SubStringCounter = SubStringCounter + 1
InStrPosition = InStrPosition + 6
End If
Loop Until InStrPosition = 0
Next i
Debug.Print "Runtime: " & Timer - RunTime & ", ""abcdef"" occurences: " & SubStringCounter
End Sub
here's a fast way to test if a substring exists, but does not return the number of substring occurrences.
Option Explicit
Option Compare Binary
Option Base 0
DefLng A-Z
Sub TestSubStringOccurence()
Dim GrabRangeArray() As Variant
Dim I As Long
Dim L As Long
Dim RunTime As Double
Dim SubStringCounter As Long
Dim J As Long
Dim InStrPosition As Long
Dim Ws As Excel.Worksheet
Const ConstABCDEFString As String = "abcdef"
Dim B As Boolean
Set Ws = ThisWorkbook.Sheets("Sheet1")
RunTime = Timer
ReDim GrabRangeArray(0 To 49999)
With Ws
For I = 1 To 50000
If I Mod 2 = 0 Then GrabRangeArray(I - 1) = "1 abcdef 2 abcdef 3 abcdef 4 abcdef 5 abcdef" _
Else GrabRangeArray(I - 1) = I - 1
Next I
.Range("a1:a50000").Value = Application.Transpose(GrabRangeArray)
End With
RunTime = Timer
For I = 1 To UBound(GrabRangeArray, 1)
If InStrB(1, GrabRangeArray(I), ConstABCDEFString, vbBinaryCompare) Then _
SubStringCounter = SubStringCounter + 1
Next I
Debug.Print "Runtime: " & Timer - RunTime & ", ""abcdef"" occurences: " & SubStringCounter
End Sub
Upvotes: 0
Reputation: 24227
This is an expansion of your idea to use Join
and InStr
:
Sub TestArraySearch()
Dim A(4) As String
A(0) = "First"
A(1) = "Second"
A(2) = "Third"
A(3) = "Fourth"
A(4) = "Fifth"
Debug.Print FastArraySearch(A, "Fi")
Debug.Print FastArraySearch(A, "o")
Debug.Print FastArraySearch(A, "hird")
Debug.Print FastArraySearch(A, "Fou")
Debug.Print FastArraySearch(A, "ndTh")
Debug.Print FastArraySearch(A, "fth")
End Sub
Function FastArraySearch(SearchArray As Variant,SearchPhrase As String) As String
Dim Pos As Long, i As Long, NumCharsProcessed As Long, Txt As String
Pos = InStr(Join(SearchArray, "§"), SearchPhrase)
If Pos > 0 Then
For i = LBound(SearchArray) To UBound(SearchArray)
NumCharsProcessed = NumCharsProcessed + Len(SearchArray(i)) + 1
If NumCharsProcessed >= Pos Then
FastArraySearch = SearchArray(i)
Exit Function
End If
Next i
End If
End Function
I did not benchmark it, but it should be quicker than doing a separate search each time through the loop. It searches once, then just adds up the string lengths until it gets to where the match was. Because the length of the string is stored before any of the characters in the string, the Len
function is highly optimized.
If this performance is still unacceptable I think you will need to find a different data structure than an array (eg, a disconnected recordset, as @Remou suggested).
Upvotes: 3
Reputation: 5689
The number one way of speeding up any array indexing operation in VB6 is to recompile the component with the following option:
Now your array indexing should be as fast as the equivalent C/C++ operation.
The only issue is that you should ensure that your code never refers to indexes outside its normal array bounds. Previously, you would get a VB runtime error. After this, you might get Access Violation instead.
Upvotes: 1
Reputation: 9726
Try using the Filter(InputStrings, Value[, Include[, Compare]]) function. It returns an array of the matching strings.
The complete syntax can be found on MSDN
Upvotes: 4
Reputation: 33145
Can you show the code you're using at how long it takes? Also, how long is too long? This code reads in 50,000 strings and finds the 275 that contain substring in just over 300 milliseconds.
Sub testarr()
Dim vaArr As Variant
Dim i As Long
Dim dTime As Double
Dim lCnt As Long
dTime = Timer
vaArr = Sheet1.Range("A1:A50000")
For i = LBound(vaArr, 1) To UBound(vaArr, 1)
If InStr(1, vaArr(i, 1), "erez") > 0 Then
lCnt = lCnt + 1
Debug.Print i, vaArr(i, 1)
End If
Next i
Debug.Print Timer - dTime
Debug.Print lCnt
End Sub
Upvotes: 3