bruno-martins
bruno-martins

Reputation: 131

Comparing elements of two strings

I've developed the following code to compare two cells (strings) in columns A and D and write down the D cell value in the corresponding B cell if a partial match is found.

Sub CompareAndGuess()
Dim strLen, aux As Integer
Dim max1, max2 As Long
Dim str As String

Range("A1").Select
Selection.End(xlDown).Select
max1 = ActiveCell.Row
Range("D1").Select
Selection.End(xlDown).Select
max2 = ActiveCell.Row

For a = 2 To max1
    str = Cells(a, 1)
    str = StrConv(str, vbUpperCase)
    strLen = Len(str)
    aux = strLen

    For l = 3 To strLen
         For d = 2 To max2
             If Cells(d, 4) = Left(str, aux) Then
                Cells(a, 2) = Cells(d, 4)
                Exit For
            ElseIf Cells(d, 4) = Right(str, aux) Then
                Cells(a, 2) = Cells(d, 4)
                Exit For
            End If
        Next d

        aux = aux - 1
        If Cells(a, 2) <> "" Then
            Exit For
        End If
    Next l
     Cells(a, 2).Select
Next a
End Sub

Can anyone help me find where is the problem because when I run it the code only guesses right one row out of 50 whereas it should match at least 40 or so.

Please, I really can't find the error in there. Feel free to propose another solution to my problem if you want.

A sample of the data I'm analysing is: Names with Typos:-

Jatiuca
Pajuara
Poco
Santa Luzia
Pajucara
Domingos Acacio
Jaragua
Stella Maris
P Verde
Tabuleiro dos Martin
Gruta Lourdes
Brasilia
Centro Historico
Monumento
Tabuleiro dos Martins

Names with Typos to be searched on this list:-

JARAGUÁ
TABULEIRO DO MARTINS
CENTRO
BRASÍLIA
CACIMBAS
JATIÚCA
CAITITUS
PAJUÇARA
CANAÃ
PONTA VERDE
CANAFÍSTULA
POÇO
CAPIATÃ
CAVACO
SANTA LÚCIA

Upvotes: 3

Views: 1617

Answers (3)

user2140261
user2140261

Reputation: 8003

This is absolutely UNTESTED

I will rewrite tomorrow, and clean it up but this is the basic way to REALLY know you are matching the correct words. It may take a little longer, and i will speed it up A LOT tomorrow but for now this is the closet way to test the words for validity

'Go through all possibly typod words
For each rngTestCell in Range("yourlist")

   'For each possibly typod word test if against every correct value
    For each rngCorrectedValue in Range("ListOfCorrectValues")

        'start by testing length to weed out most values quick
        'Test any words that are within 3 letters of each other, can be less
        'could add a tet for first and last letters match also before starting 
        'to match every letter also, just a top level weeding of words
        If (Len(rngTestCell) - Len(rngCorrectedValue)) < 3 Then

           'loop each letter in the words for match keep a record of how many are matched
           for i = 1 to Len(rngTestCell)

                If rngTestCell.Character(i,1) = rngCorrectedValue.Characters(i,1) Then
                     NumberOfMatches = NumberOfMatches + 1
                End If

            next i

            'if enough of the letters match replace the word, this will need updating because
            'i feel using a ratio of more then 10% of the words match then replace
            'but for now if more then 2 letters don't match then it isn't a match
            If (Len(rngTestCell) - NumberOfMatches) > 2 Then 'Less then 2 letters are different
                rngTestCell.Offset(,1).Value = rngCorrectedValue.Value
                Exit Loop
            End If

        End If

    Next rngCorrectedValues

Next rngTestCell 

Upvotes: 0

Graham Anderson
Graham Anderson

Reputation: 1239

I'm glad you solved the problem yourself using the InStr function. The reason your code wasn't working well was because you were comparing shortened versions of the names with full length versions. Modifying your earlier code with the following would have found a lot more matches.

            If Left(Cells(d, 4), aux) = Left(str, aux) Then
                Cells(a, 2) = Cells(d, 4)
                Exit For
            ElseIf Right(Cells(d, 4), aux) = Right(str, aux) Then
                Cells(a, 2) = Cells(d, 4)
                Exit For
            End If

Upvotes: 0

bruno-martins
bruno-martins

Reputation: 131

I've found the right way to do it with everyone's help. Here it is:

        If InStr(1, Cells(d, 4), Left(str, aux)) = 1 Then
            Cells(a, 2) = Cells(d, 4)
            Exit For
        ElseIf InStr(Cells(d, 4), Right(str, aux)) + strLen - aux = strLen Then
            Cells(a, 2) = Cells(d, 4)
            Exit For
        End If

Upvotes: 3

Related Questions