Reputation: 1678
I'm attempting to correct an incorrectly formatted HTML table. I do not have control over the source; my application just loads the contents of a downloaded file as regular text. The file contents are a simple HTML table that is missing the closing </tr> elements. I'm attempting to split the contents on <tr> to get an array, so that I can append a </tr> to the end of the elements that need it. When I split the string using fleContents.Split("<tr>").ToList, I get a lot more elements in the resulting List(Of String) than there should be.
Here is a short test snippet that shows the same behavior:
Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"
Dim testArr As String() = testSource.Split("<tr>")
'Maybe try splitting on a variable, in case a string literal containing "<>" can't be used in the Split method
Dim seper As String = "<tr>"
testArr = testSource.Split(seper)
'Feed it a new string directly
testArr = testSource.Split(New String("<tr>"))
I would expect testArr to contain 3 elements, as follows:
"<table>"
"<td>8172745</td>"
"<td>8172745</td></table>"
However, I am receiving the following array:
""
"table>"
"tr>"
"td>8172745"
"/td>"
"tr>"
"td>8172745"
"/td>"
"/table>"
Can someone please explain why the strings are being split the way they are and how I can go about getting the results I'm expecting?
Upvotes: 1
Views: 297
Reputation: 245419
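Your code is using a different overload of the Split method than you're expecting. Passing a single String most likely binds to Split(ParamArray separator As Char()), and with Option Strict Off VB narrows the string to its first character, so you end up splitting on "<" alone, which matches the array you are seeing. You want the overload that takes a String() and a StringSplitOptions parameter: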
Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"
Dim delimiter As String() = { "<tr>" }
Dim testArr As String() = _
testSource.Split(delimiter, StringSplitOptions.RemoveEmptyEntries)
You can see it working at IDEOne.
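Building on that overload, here is a minimal sketch of the repair the question is ultimately after: split on "<tr>", append the missing "</tr>" to each row, and join the pieces back together. CloseRows and SplitDemo are hypothetical names, not part of the original code, and the sketch assumes the only damage is the missing closing tags. It uses StringSplitOptions.None rather than RemoveEmptyEntries so that rejoining with "<tr>" reproduces the original text exactly.

```vb
Imports System

Module SplitDemo
    ' Hypothetical helper: appends "</tr>" to every row that lacks one.
    ' Assumes rows are introduced by a literal "<tr>" with no attributes.
    Function CloseRows(html As String) As String
        Dim parts As String() = html.Split(New String() {"<tr>"}, StringSplitOptions.None)

        ' parts(0) is everything before the first <tr>; the rest are row bodies.
        For i As Integer = 1 To parts.Length - 1
            Dim row As String = parts(i)
            If Not row.Contains("</tr>") Then
                ' The last row usually carries the trailing </table>,
                ' so the closing tag has to go in front of it.
                Dim tableEnd As Integer = row.LastIndexOf("</table>", StringComparison.OrdinalIgnoreCase)
                If tableEnd >= 0 Then
                    row = row.Insert(tableEnd, "</tr>")
                Else
                    row &= "</tr>"
                End If
                parts(i) = row
            End If
        Next

        ' Put the "<tr>" separators back between the pieces.
        Return String.Join("<tr>", parts)
    End Function

    Sub Main()
        Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"
        ' Prints: <table><tr><td>8172745</td></tr><tr><td>8172745</td></tr></table>
        Console.WriteLine(CloseRows(testSource))
    End Sub
End Module
```

If the real files contain rows such as <tr class="..."> or mixed-case tags, a literal "<tr>" separator will miss them, and an HTML parser would be the more robust choice.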
Upvotes: 2
Reputation: 11
Try using Regex.Split, like this:
Imports System.Text.RegularExpressions
Public Class Form1
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"
Dim testArr As String() = Regex.Split(testSource, "<tr>")
'Show the array in TextBox1
TextBox1.Lines = testArr
End Sub
End Class
All The Best
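One caveat with the Regex approach is that the separator is interpreted as a pattern, and Regex.Split keeps empty strings when a match sits at the very start or end of the input. A small sketch of a more defensive variant follows; SplitRows and RegexSplitDemo are hypothetical names, and the case-insensitive option is an assumption about what the downloaded files may contain.

```vb
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module RegexSplitDemo
    ' Hypothetical helper: splits on a literal "<tr>" and drops empty pieces.
    Function SplitRows(html As String) As String()
        ' "<tr>" has no regex metacharacters, so Escape changes nothing here;
        ' it matters once the delimiter contains characters such as "." or "(".
        Dim pattern As String = Regex.Escape("<tr>")

        Return Regex.Split(html, pattern, RegexOptions.IgnoreCase).
            Where(Function(part) part.Length > 0).
            ToArray()
    End Function

    Sub Main()
        Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"
        ' Prints three lines:
        '   <table>
        '   <td>8172745</td>
        '   <td>8172745</td></table>
        For Each part As String In SplitRows(testSource)
            Console.WriteLine(part)
        Next
    End Sub
End Module
```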
Upvotes: 1