Reputation: 131
I've developed the following code to compare two cells (strings) in columns A and D and write down the D cell value in the corresponding B cell if a partial match is found.
Sub CompareAndGuess()
Dim strLen, aux As Integer
Dim max1, max2 As Long
Dim str As String
Range("A1").Select
Selection.End(xlDown).Select
max1 = ActiveCell.Row
Range("D1").Select
Selection.End(xlDown).Select
max2 = ActiveCell.Row
For a = 2 To max1
str = Cells(a, 1)
str = StrConv(str, vbUpperCase)
strLen = Len(str)
aux = strLen
For l = 3 To strLen
For d = 2 To max2
If Cells(d, 4) = Left(str, aux) Then
Cells(a, 2) = Cells(d, 4)
Exit For
ElseIf Cells(d, 4) = Right(str, aux) Then
Cells(a, 2) = Cells(d, 4)
Exit For
End If
Next d
aux = aux - 1
If Cells(a, 2) <> "" Then
Exit For
End If
Next l
Cells(a, 2).Select
Next a
End Sub
Can anyone help me find where is the problem because when I run it the code only guesses right one row out of 50 whereas it should match at least 40 or so.
Please, I really can't find the error in there. Feel free to propose another solution to my problem if you want.
A sample of the data I'm analysing is: Names with Typos:-
Jatiuca
Pajuara
Poco
Santa Luzia
Pajucara
Domingos Acacio
Jaragua
Stella Maris
P Verde
Tabuleiro dos Martin
Gruta Lourdes
Brasilia
Centro Historico
Monumento
Tabuleiro dos Martins
Names with Typos to be searched on this list:-
JARAGUÁ
TABULEIRO DO MARTINS
CENTRO
BRASÍLIA
CACIMBAS
JATIÚCA
CAITITUS
PAJUÇARA
CANAÃ
PONTA VERDE
CANAFÍSTULA
POÇO
CAPIATÃ
CAVACO
SANTA LÚCIA
Upvotes: 3
Views: 1617
Reputation: 8003
This is absolutely UNTESTED
I will rewrite tomorrow, and clean it up but this is the basic way to REALLY know you are matching the correct words. It may take a little longer, and i will speed it up A LOT tomorrow but for now this is the closet way to test the words for validity
'Go through all possibly typod words
For each rngTestCell in Range("yourlist")
'For each possibly typod word test if against every correct value
For each rngCorrectedValue in Range("ListOfCorrectValues")
'start by testing length to weed out most values quick
'Test any words that are within 3 letters of each other, can be less
'could add a tet for first and last letters match also before starting
'to match every letter also, just a top level weeding of words
If (Len(rngTestCell) - Len(rngCorrectedValue)) < 3 Then
'loop each letter in the words for match keep a record of how many are matched
for i = 1 to Len(rngTestCell)
If rngTestCell.Character(i,1) = rngCorrectedValue.Characters(i,1) Then
NumberOfMatches = NumberOfMatches + 1
End If
next i
'if enough of the letters match replace the word, this will need updating because
'i feel using a ratio of more then 10% of the words match then replace
'but for now if more then 2 letters don't match then it isn't a match
If (Len(rngTestCell) - NumberOfMatches) > 2 Then 'Less then 2 letters are different
rngTestCell.Offset(,1).Value = rngCorrectedValue.Value
Exit Loop
End If
End If
Next rngCorrectedValues
Next rngTestCell
Upvotes: 0
Reputation: 1239
I'm glad you solved the problem yourself using the InStr function. The reason your code wasn't working well was because you were comparing shortened versions of the names with full length versions. Modifying your earlier code with the following would have found a lot more matches.
If Left(Cells(d, 4), aux) = Left(str, aux) Then
Cells(a, 2) = Cells(d, 4)
Exit For
ElseIf Right(Cells(d, 4), aux) = Right(str, aux) Then
Cells(a, 2) = Cells(d, 4)
Exit For
End If
Upvotes: 0
Reputation: 131
I've found the right way to do it with everyone's help. Here it is:
If InStr(1, Cells(d, 4), Left(str, aux)) = 1 Then
Cells(a, 2) = Cells(d, 4)
Exit For
ElseIf InStr(Cells(d, 4), Right(str, aux)) + strLen - aux = strLen Then
Cells(a, 2) = Cells(d, 4)
Exit For
End If
Upvotes: 3