Reputation: 31
I'm working on a program where I manually copy and paste a webpage's source code, and the program is supposed to find a certain part of that source and cut it out of the rest of the string.
I can't figure out how to cut it out of the rest of the text.
I have a string something like this:
"<b>abcdefgh qwertzuiop thepartineedtocut</b>abcdefght mnbvcxy"
And I need to get just:
"thepartineedtocut"
The problem is that it will not always be the same word, but the words around it will not change. I hope that makes sense. Thanks. Have a nice day.
Upvotes: 1
Views: 1793
Reputation: 751
If the surrounding text never changes, you can simply strip it out:
Dim input As String = "<b>abcdefgh qwertzuiop thepartineedtocut</b>abcdefght mnbvcxy"
' Remove the fixed text before and after the part you need
input = input.Replace("<b>abcdefgh qwertzuiop ", "").Replace("</b>abcdefght mnbvcxy", "")
Upvotes: 0
Reputation: 26424
You can use regular expressions:
Dim input As String = "<b>abcdefgh qwertzuiop thepartineedtocut</b>abcdefght mnbvcxy"
Dim re As New System.Text.RegularExpressions.Regex("(\w+)</b>")
Console.WriteLine(re.Match(input).Groups(1).Value) 'outputs: thepartineedtocut
The rule here is: find a word immediately before the closing </b> tag.
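Since the question says the surrounding text stays the same, the pattern could also be anchored on that fixed text. Below is a minimal sketch, assuming the text before and after the wanted part are known literal strings. The helper name ExtractBetween is made up for illustration and is not part of any library.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ExtractDemo
    ' Hypothetical helper: returns the text between two fixed literal strings,
    ' or Nothing when that pair is not found in the input.
    Function ExtractBetween(input As String, prefix As String, suffix As String) As String
        ' Regex.Escape keeps any regex metacharacters in the fixed text literal.
        Dim pattern = Regex.Escape(prefix) & "(.*?)" & Regex.Escape(suffix)
        Dim m = Regex.Match(input, pattern, RegexOptions.Singleline)
        Return If(m.Success, m.Groups(1).Value, Nothing)
    End Function

    Sub Main()
        Dim input = "<b>abcdefgh qwertzuiop thepartineedtocut</b>abcdefght mnbvcxy"
        ' Assumes "qwertzuiop " always precedes the wanted word.
        Console.WriteLine(ExtractBetween(input, "qwertzuiop ", "</b>")) ' thepartineedtocut
    End Sub
End Module
```

The lazy group (.*?) stops at the first occurrence of the suffix, so this variant also works when the wanted part contains characters that \w would not match.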
Upvotes: 1
Reputation: 460138
I would use HtmlAgilityPack to parse HTML, but maybe this naive approach is sufficient.
If the rule is "what is the string that is the last word in <b> ... </b>":
Dim myString = "<b>abcdefgh qwertzuiop thepartineedtocut</b>abcdefght mnbvcxy"
Dim result As String = Nothing
' Find the opening tag, then the closing tag that follows it
Dim bTokenStart = myString.IndexOf("<b>", StringComparison.OrdinalIgnoreCase)
If bTokenStart >= 0 Then
    bTokenStart += "<b>".Length
    Dim bTokenEnd = myString.IndexOf("</b>", bTokenStart, StringComparison.OrdinalIgnoreCase)
    If bTokenEnd >= 0 Then
        ' Text between the tags, split into words; Last() needs System.Linq
        Dim bToken = myString.Substring(bTokenStart, bTokenEnd - bTokenStart)
        result = bToken.Split({" "}, StringSplitOptions.RemoveEmptyEntries).Last() ' thepartineedtocut
    End If
End If
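For comparison, here is a rough sketch of the HtmlAgilityPack route. It assumes the HtmlAgilityPack NuGet package is referenced and that the wanted word is the last word of the first <b> element. The module name is only illustrative.

```vbnet
Imports System
Imports System.Linq
Imports HtmlAgilityPack

Module AgilityDemo
    Sub Main()
        Dim html = "<b>abcdefgh qwertzuiop thepartineedtocut</b>abcdefght mnbvcxy"
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' SelectSingleNode returns Nothing when the document has no <b> element.
        Dim bNode = doc.DocumentNode.SelectSingleNode("//b")
        If bNode IsNot Nothing Then
            ' InnerText drops the markup; the last space-separated word is the result.
            Dim lastWord = bNode.InnerText.
                Split({" "c}, StringSplitOptions.RemoveEmptyEntries).
                LastOrDefault()
            Console.WriteLine(lastWord)
        End If
    End Sub
End Module
```

A parser tends to cope better than string searching when the real page has attributes on the tag, such as <b class="...">, or nested markup inside it.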
Upvotes: 1