Nate House

Reputation: 17

Compare strings to identify duplicates

I have to write an isDup function that compares two tweets by the number of words they share and decides whether they are duplicates, based on a chosen decimal threshold (0-1).

My approach is to first write a Sub with two hardcoded tweets my professor provided (just to get an understanding before converting it to a function). When I run it, I get run-time error 5.

Option Explicit
Sub isDup()
    Dim tweet1 As String
    Dim tweet2 As String
    Dim threshold As Double
        threshold = 0.7

    tweet1 = "Hours of planning can save weeks of coding"
    tweet2 = "Weeks of programming can save you hours of planning"

    Dim tweet1Split() As String
        tweet1Split = Split(tweet1, " ")

    Dim tweet2Split() As String
        tweet2Split = Split(tweet2, " ")

    Dim i As Integer
    Dim j As Integer

    Dim sameCount As Integer

    'my thought process below was to compare strings i and j to see if equal, and if true add 1 to sameCount,
    'but the If StrComp line is where the error is

    For i = LBound(tweet1Split) To UBound(tweet1Split) Step 1
        For j = LBound(tweet2Split) To UBound(tweet2Split) Step 1
            If StrComp(i, j, vbDatabaseCompare) = 0 Then
                sameCount = sameCount + 1
                Exit For
            End If
        Next j
    Next i
End Sub
   'here i wanted to get a total count of the first tweet to compare, the duplicate tweet is true based on the number of
   'similar words
    Function totalWords(tweet1 As String) As Integer
            totalWords = 0
        Dim stringLength As Integer
        Dim currentCharacter As Integer

            stringLength = Len(tweet1)

        For currentCharacter = 1 To stringLength

        If (Mid(tweet1, currentCharacter, 1)) = " " Then
            totalWords = totalWords + 1
        End If

        Next currentCharacter
    End Function

    'this is where i compute an "isDup score" based on similar words compared to total words in tweet1, in this
    'example the threshold was stated above at 0.7
    Dim score As Double
        score = sameCount / totalWords

    If score > threshold Then
        MsgBox "isDup Score: " & score & " ...This is a duplicate"
    Else
        MsgBox "isDup Score: " & score & " ...This is not a duplicate"
    End If

End Sub

Upvotes: 1

Views: 96

Answers (1)

urdearboy

Reputation: 14580

First issue:

i and j are just indexes. You want to compare the strings stored at those indexes, so:

If StrComp(tweet1Split(i), tweet2Split(j), vbDatabaseCompare) = 0 Then

Second issue:

As noted in the Microsoft documentation for StrComp, vbDatabaseCompare is reserved for Access, which you are not using; that is the source of your second error. You need to switch to a different comparison, such as vbBinaryCompare (case-sensitive) or vbTextCompare (case-insensitive).
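With both fixes applied, the check could be written as a function along these lines. This is only a minimal sketch: the name IsDup, its parameters, and the choice of vbTextCompare (so that "Hours" and "hours" count as the same word) are my assumptions, not something stated in the assignment. It also takes the total word count from the split array rather than counting spaces.

```vba
Option Explicit

' Hypothetical helper: returns True when the share of words in tweet1
' that also appear in tweet2 is greater than threshold (0-1).
Function IsDup(tweet1 As String, tweet2 As String, threshold As Double) As Boolean
    Dim words1() As String
    Dim words2() As String
    Dim i As Long
    Dim j As Long
    Dim sameCount As Long
    Dim total As Long

    words1 = Split(tweet1, " ")
    words2 = Split(tweet2, " ")

    ' Number of words in tweet1 (0 when tweet1 is an empty string)
    total = UBound(words1) - LBound(words1) + 1
    If total = 0 Then Exit Function   ' nothing to compare: returns False

    For i = LBound(words1) To UBound(words1)
        For j = LBound(words2) To UBound(words2)
            ' Compare the words themselves, not the loop indexes.
            ' vbTextCompare is an assumption: it ignores case.
            If StrComp(words1(i), words2(j), vbTextCompare) = 0 Then
                sameCount = sameCount + 1
                Exit For
            End If
        Next j
    Next i

    IsDup = (sameCount / total) > threshold
End Function
```

For the two tweets in the question, 7 of the 8 words in the first tweet are found in the second, so the score is 0.875 and IsDup returns True for a threshold of 0.7. With vbBinaryCompare, "Hours" and "weeks" would no longer match their differently capitalised counterparts, the score would drop to 0.625, and the result would be False.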

Upvotes: 1
