JoP

Reputation: 45

Fastest way to conditionally strip off the right part of a string

I need to remove the numeric part at the end of a string. Here are some examples:

You get the idea.

I implemented a function which works well for this purpose.

Function VarStamm(name As String) As String
    Dim i, a As Integer
    a = 0
    For i = Len(name) To 1 Step -1
        If IsNumeric(Mid(name, i, 1)) = False Then
            i = i + 1
            Exit For
        End If
    Next i
    If i <= Len(name) Then
        VarStamm = name.Substring(0, i - 1)
    Else
        VarStamm = name
    End If
End Function

The question is: is there a faster way to do this? I call this function in a loop with 3 million iterations, so any speedup would help.

I know about the String.LastIndexOf method, but I don't know how to use it when what I need is the index where the trailing run of digits begins.

Upvotes: 1

Views: 85

Answers (3)

Steven Doggart

Reputation: 43743

I was skeptical that the Array.FindLastIndex method was actually faster, so I tested it myself. I borrowed the testing code posted by Amessihel, but added a third method:

Function VarStamm3(name As String) As String
    Dim i As Integer
    For i = name.Length - 1 To 0 Step -1
        If Not Char.IsDigit(name(i)) Then
            Exit For
        End If
    Next i
    Return name.Substring(0, i + 1)
End Function

It uses your original algorithm but swaps the old VB6-style string functions for their newer .NET equivalents. Here are the results on my machine:

RunTime :
 - VarStamm : 00:00:07.92
 - VarStamm2 : 00:00:00.60
 - VarStamm3 : 00:00:00.23

As you can see, your original algorithm was already quite well tuned. The problem wasn't the loop; it was Mid, IsNumeric, and Len. Since Tim's method doesn't use those, it was much faster. But if you stick with a manual For loop, it's more than twice as fast as using Array.FindLastIndex, all else being equal.
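
As a rough check of how the VarStamm3 loop behaves at the edges, here is a minimal sketch that runs it over a few inputs. The module name EdgeCaseDemo and the sample strings are hypothetical, chosen only for illustration; they are not taken from the question.

```vbnet
Imports System

Module EdgeCaseDemo ' hypothetical module name, for illustration only

    ' Same loop as VarStamm3: walk backwards until the first non-digit.
    Function VarStamm3(name As String) As String
        Dim i As Integer
        For i = name.Length - 1 To 0 Step -1
            If Not Char.IsDigit(name(i)) Then
                Exit For
            End If
        Next i
        ' i is the index of the last non-digit, or -1 if there is none.
        Return name.Substring(0, i + 1)
    End Function

    Sub Main()
        ' Sample inputs are assumptions made for this sketch.
        For Each s As String In {"abc123", "abc", "a1b22", "123", ""}
            Console.WriteLine("""{0}"" -> ""{1}""", s, VarStamm3(s))
        Next
    End Sub
End Module
```

With these inputs, an all-digit or empty string comes back as an empty string. The other versions in this thread appear to differ there: VarStamm2 as posted returns an all-digit input unchanged (Array.FindLastIndex yields -1), and the original VarStamm would reach name.Substring(0, -1), which throws. Whether that matters depends on the data being processed.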

Upvotes: 2

Amessihel

Reputation: 6424

Given your function VarStamm and Tim Schmelter's, which I named VarStamm2, here is a small performance test I wrote. I typed an arbitrarily long string with a long numeric suffix and ran each function one million times.

Module StackOverlow

    Sub Main()
        Dim testStr = "azekzoerjezoriezltjreoitueriou7657678678797897898997897978897898797989797"

        Console.WriteLine("RunTime :" + vbNewLine +
               " - VarStamm : " + getTimeSpent(AddressOf VarStamm, testStr) + vbNewLine +
               " - VarStamm2 : " + getTimeSpent(AddressOf VarStamm2, testStr))

    End Sub

    Function getTimeSpent(f As Action(Of String), str As String) As String
        Dim sw As Stopwatch = New Stopwatch()
        Dim ts As TimeSpan

        sw.Start()
        For i = 1 To 1000000
            f(str)
        Next
        sw.Stop()
        ts = sw.Elapsed
        ' Integer division (\) keeps the hundredths field within 00-99.
        Return String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
            ts.Hours, ts.Minutes, ts.Seconds,
            ts.Milliseconds \ 10)
    End Function

    Function VarStamm(name As String) As String
        Dim i, a As Integer
        a = 0
        For i = Len(name) To 1 Step -1
            If IsNumeric(Mid(name, i, 1)) = False Then
                i = i + 1
                Exit For
            End If
        Next i
        If i <= Len(name) Then
            VarStamm = name.Substring(0, i - 1)
        Else
            VarStamm = name
        End If
    End Function

    Function VarStamm2(name As String) As String
        Dim lastNonDigitIndex = Array.FindLastIndex(name.ToCharArray(), Function(c) Not Char.IsDigit(c))

        If lastNonDigitIndex >= 0 Then
            lastNonDigitIndex += 1
            Return name.Substring(0, lastNonDigitIndex)
        End If
        Return name
    End Function
End Module

Here is the output I got:

RunTime :
 - VarStamm : 00:00:38.33
 - VarStamm2 : 00:00:02.72

So yes, you should choose his answer: his code is both clean and efficient.

Upvotes: 2

Tim Schmelter

Reputation: 460288

You can use Array.FindLastIndex and then Substring:

Dim lastNonDigitIndex = Array.FindLastIndex(text.ToCharArray(), Function(c) Not Char.IsDigit(c))

If lastNonDigitIndex >= 0 Then
    lastNonDigitIndex += 1
    Dim part1 = text.Substring(0, lastNonDigitIndex) ' everything up to and including the last non-digit
    Dim part2 = text.Substring(lastNonDigitIndex)    ' the trailing digits
End If

Upvotes: 3
