Whitekang

Reputation: 57

VB.NET: Searching for certain values in text

I have written a piece of code that reads a String and extracts certain parts of it.

In particular, I want to get the numbers contained in a custom tag written in the text: [propertyid=]. For example, [propertyid=541] should return 541.

This search and retrieval needs to happen once for every tag in the text.

I have already written code that works:

Module Module1

    Sub Main()
        Dim properties As New List(Of String)
        'context of string doesn't matter, only the ids are important
        Dim text As String = "Dit is de voorbeeld string. Eerst komt er gewoon tekst. Daarna een property als [propertyid=1155641] met nog wat tekst. Dan volgt nog een [propertyid=1596971418413399] en dan volgt het einde."
        Dim found As Integer = 1

        Do
            found = InStr(found, text, "[propertyid=")
            If found <> 0 Then
                properties.Add(text.Substring(found + 11, text.IndexOf("]", found + 11) - found - 11).Trim())
                found = text.IndexOf("]", found + 11)
            End If
        Loop While found <> 0

        Console.WriteLine("lijst")
        For Each itemos As String In properties
            Console.WriteLine(itemos)
        Next
    End Sub

End Module

But I can't help feeling that this isn't optimal. I'm fairly sure it can be written more simply, or with tools other than Substring and IndexOf, especially because I have to juggle the indexes and the loop by hand.

Any suggestions for improving this piece of code?

Upvotes: 1

Views: 157

Answers (1)

Robin Mackenzie

Reputation: 19299

You can use regular expressions for this kind of task.

In this case, the pattern to match [propertyid=NNNN] is:

\[propertyid=(\d+)\]

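This isolates a run of one or more digits (\d+) in a capture group (the parentheses), so the matching engine can return the digits separately from the rest of the match. The square brackets are escaped with backslashes because [ and ] otherwise have a special meaning in a regular expression.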

Here's a code example:

Imports System.Text.RegularExpressions

Module Module1

    Sub Main()

        Dim properties As New List(Of String)
        'context of string doesn't matter, only the ids are important
        Dim text As String = "Dit is de voorbeeld string. Eerst komt er gewoon tekst. Daarna een property als [propertyid=1155641] met nog wat tekst. Dan volgt nog een [propertyid=1596971418413399] en dan volgt het einde."
        Dim pattern As String = "\[propertyid=(\d+)\]"

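        ' Regex.Matches returns every non-overlapping match in the text.
        For Each m As Match In Regex.Matches(text, pattern)
            ' Groups(0) is the whole tag; Groups(1) is the first capture group, i.e. only the digits.
            properties.Add(m.Groups(1).Value)
        Next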

        For Each s As String In properties
            Console.WriteLine(s)
        Next

        Console.ReadKey()

    End Sub

End Module
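
One difference from the code in the question: that code calls Trim() on each value, so it tolerates spaces around the number, while \d+ on its own does not. If that matters, a possible variant is sketched below. It assumes you would rather have the ids as numbers than as strings; note that the second id in the sample is too large for an Integer, so Long is used. The function name ExtractPropertyIds and the group name id are illustrative choices of mine, not anything from the original code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module PropertyIdSketch

    ' Hypothetical helper (not from the original post): returns every
    ' property id found in the input as a Long.
    Function ExtractPropertyIds(input As String) As List(Of Long)
        Dim ids As New List(Of Long)
        ' \s* allows optional whitespace around the digits, mirroring the
        ' Trim() call in the question. (?<id>...) is a named capture group.
        Dim pattern As String = "\[propertyid=\s*(?<id>\d+)\s*\]"

        For Each m As Match In Regex.Matches(input, pattern)
            Dim id As Long
            ' TryParse skips a value that does not fit in a Long instead of throwing.
            If Long.TryParse(m.Groups("id").Value, id) Then
                ids.Add(id)
            End If
        Next

        Return ids
    End Function

    Sub Main()
        Dim sample As String = "Text [propertyid=1155641] more text [propertyid= 1596971418413399 ] end."

        For Each id As Long In ExtractPropertyIds(sample)
            Console.WriteLine(id)
        Next
    End Sub

End Module
```

Whether to skip unparseable ids silently, as this sketch does, or to report them depends on how much you trust the input text.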

Upvotes: 4
