emilk
emilk

Reputation: 85

Fuzzy string matching Excel

I am currently in need of a fuzzy string matching algorithm. I found one VBA code from this link given here: Fuzzy Matching.

Function FuzzyFind(lookup_value As String, tbl_array As Range) As String
Dim i As Integer, str As String, Value As String
Dim a As Integer, b As Integer, cell As Variant
For Each cell In tbl_array
  str = cell
  For i = 1 To Len(lookup_value)
    If InStr(cell, Mid(lookup_value, i, 1)) > 0 Then
      a = a + 1
      cell = Mid(cell, 1, InStr(cell, Mid(lookup_value, i, 1)) - 1) & Mid(cell, InStr(cell, Mid(lookup_value, i, 1)) + 1, 9999)
    End If
  Next i
  a = a - Len(cell)
  If a > b Then
    b = a
    Value = str
  End If
  a = 0
Next cell
FuzzyFind = Value
End Function

However, this gives you the match no matter how distant the "correct answer" is. Is there any way to implement that the functions gives "N/A" if it is, let's say, 4 characters or more away from the original string?

Upvotes: 1

Views: 2827

Answers (1)

Simon
Simon

Reputation: 1375

Have a try with this and see if it's what you're after. It's loosely based off the one you have there.

EDIT: Did some more testing and found my original version not quite right. This should be better but it's near impossible to get something like this to work for every eventuality.

Function FuzzyFind(lookup_value As String, tbl_array As Range) As String

Dim i As Long, cell As Range, Matches As Long, LengthError As Long, _
FuzzyValue As String, FuzzyMatch As Long, L As String, C As String, MultipleReturns As Boolean

For Each cell In tbl_array
    Matches = 0
    If cell.Value <> "" Then
        L = UCase(lookup_value)
        C = UCase(cell.Value)
        For i = 1 To Len(L)
            If InStr(Mid(L, i, Len(L) - i), Mid(C, i, 1)) > 0 Then
                Matches = Matches + 1
            Else
                Matches = Matches - 1
            End If
        Next i

        LengthError = Abs(Len(C) - Len(L))
        Matches = Matches - LengthError
        If Len(L) - Matches <= 4 And Matches >= FuzzyMatch Then
            If Matches = FuzzyMatch Then
                MultipleReturns = True
                Exit For
            End If
            FuzzyValue = cell.Value
            FuzzyMatch = Matches
        End If
    End If
Next
If FuzzyValue <> "" Then
    If MultipleReturns = True Then
        FuzzyFind = "N/A (Multiple Returns)"
    Else
        FuzzyFind = FuzzyValue
    End If
Else
    FuzzyFind = "N/A"
End If

End Function

Upvotes: 2

Related Questions