user260845

In VB.NET create a string from parts of a different string

This string is automatically generated with an application I can't access or change: "http://www.site.com/locale=euen&mag=testit&issue=322&page=5&template=testit-t1"

I need to change the string to

"http://www.site.com/322_5"

where 322 is the issue number and 5 is the page number. Both values vary and are not known beforehand.

How do I do this in VB.NET? (It has to be VB.NET) I've tried things with split and compare, but I'm a disaster with dissecting strings. Help would be most welcome!

EDIT:
After trying Konrad's solution below, I get an error when I run the string through it. All other URLs keep working fine, but as soon as I pass in one in the format that needs to be converted, it fails.

I suspect this is because the conversion function is part of yet another function, and I'm doing something wrong when trying to put the regex function in. This is the complete function:

        Function ExpandLine(ByRef sLine, ByVal nStart)
        'Purpose: adapt expandLine into a function that replaces
        ' '       the urls from the UNIT with redirects
        ' '
        ' ' Purpose: This function searches recursively
        ' '          for strings embedded in "{" and "}" pairs.
        ' '          These strings contain a left and right part
        ' '          separated by ";".  The left part will be
        ' '          hyperlinked with the right part.
        ' '
        ' ' Input:   sLine - string to be expanded
        ' '          nStart - where to start the expansion from
        ' '          the right (normally set to -1)
        ' '
        ' ' Output:  sLine - expanded string
        ' '
        ' ' Example: This line contains a {hyperlink;http://www.site.com}
        ' '          that points to the homepage

        Dim n, n1, n2 As Integer
        Dim sUrl As String

        If nStart <> 0 Then
            n = InStrRev(sLine, "{", nStart)
            If n <> 0 Then
                n1 = InStr(n, sLine, ";")
                n2 = InStr(n, sLine, "}")
                If Not (n1 = 0 Or n2 = 0) Then
                    sUrl = Mid(sLine, n1 + 1, n2 - n1 - 1)

                    'use RegEx to determine if its an UNIT url
                    Const TestPattern = _
                      "^http://[^/]+/locale=[^&]+&mag=[^&]+&issue=[^&]+&page=[^&]+&template=[^&]+$"

                    Dim conformsToPattern = Regex.IsMatch(sUrl, TestPattern)

                    If conformsToPattern Then
                        Const SitePattern = "(http://[^/]+)/"
                        Const IssuePattern = "issue=(\d+)"
                        Const PagePattern = "page=(\d+)"

                        Dim sSite = Regex.Match(sUrl, SitePattern).Groups(1).Value
                        Dim sIssue = Regex.Match(sUrl, IssuePattern).Groups(1).Value
                        Dim sPage = Regex.Match(sUrl, PagePattern).Groups(1).Value

                        sUrl = String.Format("{1}/{2}_{3}", sSite, sIssue, sPage)
                    End If

                    sLine = _
                      Left(sLine, n - 1) & "<a class=""smalllink"" target=""_new"" href=""" & _
                      sUrl & """>" & Mid(sLine, n + 1, n1 - n - 1) & "</a>" & _
                      Right(sLine, Len(sLine) - n2)
                    ExpandLine(sLine, n - 1)
                End If
            End If
        End If
    End Function

Is the problem in the following line?

sUrl = String.Format("{1}/{2}_{3}", sSite, sIssue, sPage)?

Answers (1)

Konrad Rudolph

You want regular expressions:

' Requires Imports System.Text.RegularExpressions at the top of the file.
Const SitePattern = "(http://[^/]+)/"
Const IssuePattern = "issue=(\d+)"
Const PagePattern = "page=(\d+)"

Dim site = Regex.Match(input, SitePattern).Groups(1).Value
Dim issue = Regex.Match(input, IssuePattern).Groups(1).Value
Dim page = Regex.Match(input, PagePattern).Groups(1).Value

' String.Format placeholders are zero-based: {0} = site, {1} = issue, {2} = page.
Dim result = String.Format("{0}/{1}_{2}", site, issue, page)

This searches, respectively, for the name of the website domain (including the leading http://, and delimited by the first following forward slash), the number that follows after the issue parameter and the number that follows after the page parameter.

It then constructs the result string from these three findings.
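One detail to watch when building the result: String.Format placeholders are zero-based, so three arguments are addressed as {0}, {1} and {2}. The format string "{1}/{2}_{3}" from the question asks for a fourth argument that does not exist, which raises a FormatException at run time. A minimal sketch of the difference (the module name FormatDemo and the sample values are only for illustration):

```vbnet
Imports System

Module FormatDemo
    Sub Main()
        Dim site As String = "http://www.site.com"
        Dim issue As String = "322"
        Dim page As String = "5"

        ' Zero-based placeholders: {0} = site, {1} = issue, {2} = page.
        Console.WriteLine(String.Format("{0}/{1}_{2}", site, issue, page))

        Try
            ' {3} refers to a fourth argument, but only three are supplied.
            Console.WriteLine(String.Format("{1}/{2}_{3}", site, issue, page))
        Catch ex As FormatException
            Console.WriteLine("FormatException: " & ex.Message)
        End Try
    End Sub
End Module
```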

Numbers are matched with \d+, where \d matches any digit and + tells the engine to match one or more of them.

For the website, we allow any character except the forward slash: [^/] is a character class, and the leading ^ negates it, i.e. it matches any character not listed in it.
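Put together, the three searches could be wrapped in one function. This is only a sketch: the name BuildShortUrl and the choice to return the input unchanged when a part is missing are assumptions, not something the question specifies. Checking Match.Success matters because Groups(1).Value is an empty string when a pattern does not match, which would otherwise produce a malformed URL silently.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module UrlShortener
    ' Hypothetical helper: returns "site/issue_page", or the input unchanged
    ' when any of the three parts cannot be found.
    Function BuildShortUrl(ByVal input As String) As String
        Dim site As Match = Regex.Match(input, "(http://[^/]+)/")
        Dim issue As Match = Regex.Match(input, "issue=(\d+)")
        Dim page As Match = Regex.Match(input, "page=(\d+)")

        If Not (site.Success AndAlso issue.Success AndAlso page.Success) Then
            Return input
        End If

        Return String.Format("{0}/{1}_{2}", _
                             site.Groups(1).Value, _
                             issue.Groups(1).Value, _
                             page.Groups(1).Value)
    End Function

    Sub Main()
        ' Prints http://www.site.com/322_5
        Console.WriteLine(BuildShortUrl( _
            "http://www.site.com/locale=euen&mag=testit&issue=322&page=5&template=testit-t1"))
    End Sub
End Module
```

Inside ExpandLine, the call would then be sUrl = BuildShortUrl(sUrl), which keeps ordinary URLs such as http://www.site.com untouched.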

EDIT: If you first want to test whether the input actually conforms to your pattern, you may do the following. Notice, however, that this test is sensitive to the order of the GET parameters and I’d take this as a warning sign to do it differently: since the order of GET parameters in a URL isn’t important, can you guarantee that it will stay the same?

Const TestPattern = "^http://[^/]+/locale=[^&]+&mag=[^&]+&issue=[^&]+&page=[^&]+&template=[^&]+$"

Dim conformsToPattern = Regex.IsMatch(input, TestPattern)

If conformsToPattern Then
    ' Yes, go ahead. '
Else
    ' Nope, leave it unchanged. '
End If

This just checks that the whole string (from start = ^ to end = $) is matched by the pattern. Each parameter value is written as [^&]+, i.e. one or more characters other than & (the delimiter between parameters).
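If the parameter order cannot be guaranteed, one alternative is to skip the single test pattern and read the name=value pairs into a dictionary instead. The sketch below is an assumption about how that could look, not part of the original solution: the name TryBuildShortUrl is hypothetical, it relies on System.Uri to split the host from the rest, and it does not verify that issue and page are numeric.

```vbnet
Imports System
Imports System.Collections.Generic

Module OrderIndependent
    ' Hypothetical helper: succeeds when both "issue" and "page" are present,
    ' regardless of where they appear among the parameters.
    Function TryBuildShortUrl(ByVal input As String, ByRef result As String) As Boolean
        result = input

        Dim parsed As Uri = Nothing
        If Not Uri.TryCreate(input, UriKind.Absolute, parsed) Then Return False

        ' Everything after the host is treated as a list of name=value pairs.
        Dim values As New Dictionary(Of String, String)(StringComparer.OrdinalIgnoreCase)
        For Each pair As String In parsed.AbsolutePath.TrimStart("/"c).Split("&"c)
            Dim parts() As String = pair.Split("="c)
            If parts.Length = 2 Then values(parts(0)) = parts(1)
        Next

        Dim issue As String = Nothing
        Dim page As String = Nothing
        If Not (values.TryGetValue("issue", issue) AndAlso values.TryGetValue("page", page)) Then
            Return False
        End If

        result = String.Format("{0}/{1}_{2}", _
                               parsed.GetLeftPart(UriPartial.Authority), issue, page)
        Return True
    End Function

    Sub Main()
        Dim shortUrl As String = Nothing
        ' The parameters are deliberately in a different order than in the question.
        If TryBuildShortUrl("http://www.site.com/page=5&mag=testit&issue=322", shortUrl) Then
            Console.WriteLine(shortUrl)
        End If
    End Sub
End Module
```

On failure the function leaves result equal to the input, so callers can use the value either way.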
