Seamus
Seamus

Reputation: 39

How do I fuzzy match just adjacent cells?

I have a row of 10,000 names in two corresponding columns, 10,000 in each. Each cell in Column A corresponds to the adjacent cell in Column B. I want to do a fuzzy match and get a compatibility score on all of them just with the adjacent cell. I do not want it to search entire column versus entire column, just adjacent cells, which I don't seem to be able to do with the Fuzzy Match Excel add in, ideas?

Example:

Column A:       Column B:        Value:
Apple           Aplle            80%
Banana          Banana           100%
Orange          Ornge            85%   

Upvotes: 3

Views: 3067

Answers (2)

Kyle Deer
Kyle Deer

Reputation: 121

Import this VBA module: https://github.com/kyledeer-32/vba_fuzzymatching

It contains several Fuzzy Matching UDFs - the one your specifically looking for is =String_Similarity. It will give you the match percentage between two strings. You can also use =Fuzzy_Match for matching one string (one cell) to an array of cells (one column).

Here are my results using applying this VBA module to your example data:

enter image description here

Here is a formula view:

enter image description here

Note: after importing this module, you will need to enable the "Microsoft Scripting Runtime" library in the Visual Basic Editor Window it to run. Steps to do this (takes less than a minute):

From Excel Workbook:

  • Select Developer tab on ribbon
  • Select Visual Basic
  • Select Tools on the Toolbar
  • Select References
  • Scroll down until you see Microsoft Scripting Runtime, then check the box
  • Press OK

Upvotes: 2

PankajR
PankajR

Reputation: 407

Well, I don't know about Fuzzy Match Addin but you can accomplish similar to your requirement using UDF.

Something like this based on your sample data

Function FuzzyComparision(String1 As String, String2 As String) As Double
Dim intStringLength As Integer
Dim dblScore As Double
Dim dblUnitScore As Double
Dim intCounter As Integer

intStringLength = WorksheetFunction.Max(Len(String1), Len(String2))
dblUnitScore = 1 / intStringLength
dblScore = 0#

For intCounter = 1 To intStringLength
    If Mid(UCase(String1), intCounter, 1) = Mid(UCase(String2), intCounter, 1) Then
        dblScore = dblScore + dblUnitScore
    Else
        If Len(String1) <> Len(String2) And intCounter < intStringLength Then
            If Mid(UCase(String1), intCounter + 1, 1) = Mid(UCase(String2), intCounter, 1) Then
                dblScore = dblScore + dblUnitScore
            End If
        End If
    End If
Next

FuzzyComparision = dblScore
End Function

Of course the comparison can be further improvised as required. Let me know if that helps

Edit: as Grade Bacon pointed (thanks for that observation), this does not work as expected for missing letters. I have tweaked the function a bit to do a case insensitive comparison and compensate for missing letters as in example.

Even though it works, utility depends on acceptable score deviation +/-. OP may shed some light.

Upvotes: 2

Related Questions