Reputation: 5063
I'm trying to write a VB.NET function that extracts the text between a start tag and an end tag. I wrote this function:
Public Function GetTagContent(ByRef instance_handler As String, ByRef start_tag As String, ByRef end_tag As String) As String
Dim s As String = ""
Dim content() As String = instance_handler.Split(start_tag)
If content.Count > 1 Then
Dim parts() As String = content(1).Split(end_tag)
If parts.Count > 0 Then
s = parts(0)
End If
End If
Return s
End Function
But it doesn't work. For example, with the following test code:
Dim testString As String = "<body>my example <div style=""margin-top:20px""> text to extract </div> <br /> another line.</body>"
txtOutput.Text = testString.GetTagContent("<div style=""margin-top:20px"">", "</div>")
I get only the string "body>my example" instead of "text to extract".
Can anyone help me? Thanks in advance.
I wrote a new routine, and the following code works. However, I would like to know whether there is a better-performing way to do this:
Dim s As New StringBuilder()
Dim i As Integer = instance_handler.IndexOf(start_tag, 0)
If i < 0 Then
Return ""
Else
i = i + start_tag.Length
End If
Dim j As Integer = instance_handler.IndexOf(end_tag, i)
If j < 0 Then
s.Append(instance_handler.Substring(i))
Else
s.Append(instance_handler.Substring(i, j - i))
End If
Return s.ToString
Upvotes: 1
Views: 1510
Reputation: 43743
XPath is one way of accomplishing this task. I'm sure others will suggest LINQ. Here's an example using XPath:
Dim testString As String = "<body>my example <div style=""margin-top:20px""> text to extract </div> <br /> another line.</body>"
Dim doc As XmlDocument = New XmlDocument()
doc.LoadXml(testString)
MessageBox.Show(doc.SelectSingleNode("/body/div").InnerText)
Obviously, a more complex document may require a more complex XPath than simply "/body/div", but it's still pretty simple.
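As a rough illustration of a slightly more complex path, the sketch below selects a div by its style attribute rather than by its position in the document. The helper name GetDivTextByStyle and the module name are hypothetical, and the sketch assumes the input is well-formed XML and that the attribute value contains no single quote.

```vb
Imports System
Imports System.Xml

Module XPathAttributeExample
    ' Hypothetical helper: returns the inner text of the first <div> whose
    ' style attribute equals styleValue, or "" when no such element exists.
    Function GetDivTextByStyle(xml As String, styleValue As String) As String
        Dim doc As New XmlDocument()
        ' LoadXml throws an XmlException if the input is not well-formed XML.
        doc.LoadXml(xml)
        ' Assumption: styleValue contains no single quote, because it is
        ' embedded directly in the XPath string literal.
        Dim node As XmlNode = doc.SelectSingleNode("//div[@style='" & styleValue & "']")
        If node Is Nothing Then Return ""
        Return node.InnerText
    End Function

    Sub Main()
        Dim testString As String = "<body>my example <div style=""margin-top:20px""> text to extract </div> <br /> another line.</body>"
        ' Trim removes the spaces that surround the text inside the div.
        Console.WriteLine(GetDivTextByStyle(testString, "margin-top:20px").Trim())
    End Sub
End Module
```

The "//div" prefix matches a div at any depth, so the path keeps working if the element is nested more deeply than in the test string.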
If you need to get a list of multiple elements that match the path, you can use doc.SelectNodes.
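A minimal sketch of the SelectNodes variant follows. The helper name GetAllInnerText and the sample markup are made up for illustration, and the input is again assumed to be well-formed XML.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module SelectNodesExample
    ' Hypothetical helper: collects the trimmed inner text of every node
    ' that matches the given XPath expression.
    Function GetAllInnerText(xml As String, xpath As String) As List(Of String)
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)
        Dim results As New List(Of String)()
        ' SelectNodes returns an XmlNodeList; it is empty when nothing matches.
        For Each node As XmlNode In doc.SelectNodes(xpath)
            results.Add(node.InnerText.Trim())
        Next
        Return results
    End Function

    Sub Main()
        Dim sample As String = "<body><div>first</div><p>skip</p><div>second</div></body>"
        ' Prints "first" and then "second", one per line.
        For Each item As String In GetAllInnerText(sample, "/body/div")
            Console.WriteLine(item)
        Next
    End Sub
End Module
```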
Upvotes: 2