lateralus

Reputation: 1020

VB.NET regex matching numeric substrings

I'm trying to write a VB function that takes a String as input and returns the string of numeric digits from the beginning up to the first non-numeric character, or an empty string if there are none, so:

123 -> 123
12f -> 12
12g34 -> 12
f12 -> ""
"" -> ""

I wrote a function that grows a prefix of the input one character at a time and checks it against the regex, but it keeps going even past non-numeric characters...

This is the function:

Public Function ParseValoreVelocita(ByVal valoreRaw As String) As String

    Dim result As New StringBuilder
    Dim regexp As New Regex("^[0-9]+")
    Dim tmp As New StringBuilder
    Dim stringIndex As Integer = 0
    Dim out As Boolean = False

    While stringIndex < valoreRaw.Length AndAlso Not out
        tmp.Append(valoreRaw.ElementAt(stringIndex))
        If regexp.Match(tmp.ToString).Success Then
            result.Append(valoreRaw.ElementAt(stringIndex))
            stringIndex = stringIndex + 1
        Else
            out = True
        End If
    End While

    Return result.ToString

End Function

The output always equals the input string, so something is wrong and I can't figure out what...

Upvotes: 0

Views: 43

Answers (3)

Enigmativity

Reputation: 117057

You have made your code very complex for a simple task.

Your loop builds a longer and longer string and checks each time whether it still starts with digits; if it does, it keeps appending to the result.

For an input string of "123x" that check never fails. The pattern "^[0-9]+" is anchored only at the start, so it matches "1", then "12", then "123", and then still matches "123x" because of the leading "123". The loop never exits early, and the output equals the input.

Here's what you should be doing:

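' Requires: Imports System.Text.RegularExpressions
Public Function ParseValoreVelocita(valoreRaw As String) As String
    ' Match the run of digits at the start of the input in a single call.
    Dim regexp As New Regex("^([0-9]+)")
    Dim match = regexp.Match(valoreRaw)
    If match.Success Then
        ' Group 1 holds the leading digits.
        Return match.Groups(1).Captures(0).Value
    Else
        ' The input does not start with a digit.
        Return ""
    End If
End Function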

No loop and you let the regex do the work.
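As a small illustration of the anchoring behaviour described above, the sketch below checks the same inputs with and without an end anchor. The module name RegexAnchorSketch is made up for this example; the Regex calls are the standard ones from System.Text.RegularExpressions.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexAnchorSketch
    Sub Main()
        ' Anchored only at the start: succeeds because "12" is a digit prefix of "12f".
        Console.WriteLine(Regex.IsMatch("12f", "^[0-9]+"))           ' True
        ' Anchored at both ends: every character must be a digit.
        Console.WriteLine(Regex.IsMatch("12f", "^[0-9]+$"))          ' False
        ' A single match against the whole input returns the leading digits.
        Console.WriteLine(Regex.Match("12g34", "^[0-9]+").Value)     ' 12
        ' No leading digits: the failed match has an empty Value.
        Console.WriteLine(Regex.Match("f12", "^[0-9]+").Value = "")  ' True
    End Sub
End Module
```

So the per-character loop could only stop on a non-digit if the pattern were anchored at both ends, which is one more reason to match once and read the result.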

Upvotes: 0

Tim Schmelter

Reputation: 460098

Here's a LINQ solution that doesn't need regex and increases readability:

Dim startDigits = valoreRaw.TakeWhile(AddressOf Char.IsDigit)
Dim result As String = String.Concat(startDigits)
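One difference from the regex worth knowing: Char.IsDigit also returns True for decimal digits outside ASCII, while [0-9] only accepts the ten ASCII digits. If that matters, a variant along these lines might do; LeadingAsciiDigits and the module name are hypothetical names chosen for this sketch, and returning "" for Nothing is an assumption about the desired behaviour.

```vbnet
Imports System
Imports System.Linq

Module LeadingDigitsSketch
    ' Hypothetical helper: keeps only the leading ASCII digits 0-9.
    Function LeadingAsciiDigits(input As String) As String
        ' Assumption: treat a missing input like an empty one.
        If input Is Nothing Then Return ""
        Dim digits = input.TakeWhile(Function(c) c >= "0"c AndAlso c <= "9"c)
        Return String.Concat(digits)
    End Function

    Sub Main()
        ' The sample inputs from the question.
        For Each sample In {"123", "12f", "12g34", "f12", ""}
            Console.WriteLine("""{0}"" -> ""{1}""", sample, LeadingAsciiDigits(sample))
        Next
    End Sub
End Module
```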

Upvotes: 3

JonM

Reputation: 1374

Try this instead. You need to use a capture group:

Public Function ParseValoreVelocita(ByVal valoreRaw As String) As String

    ' Group 1 captures the run of digits at the start of the input.
    Dim regexp As New Regex("^([0-9]+)")

    ' Match once against the whole input instead of growing a prefix.
    Dim found As Match = regexp.Match(valoreRaw)

    If found.Success Then
        Return found.Groups(1).Value
    End If

    ' The input does not start with a digit.
    Return ""

End Function

The expression now has a capture group:

Dim regexp As New Regex("^([0-9]+)")

and the result is read from that group after a single match against the whole input:

Return found.Groups(1).Value

Upvotes: 0
