user1570048

Reputation: 880

VB.NET: how to make a StreamReader ignore some lines?

I am using a StreamReader to get the HTML of a page, but there are lines that I want to ignore, for example any line that starts with <span>.

Any advice? Here is my function:

Public Function GetPageHTMLReaderNoPrx(ByVal address As Uri) As StreamReader
  Dim request As HttpWebRequest
  Dim response As HttpWebResponse = Nothing
  Dim reader As StreamReader = Nothing 'stays Nothing if the request fails

  Try
    request = DirectCast(WebRequest.Create(address), HttpWebRequest)
    response = DirectCast(request.GetResponse(), HttpWebResponse)

    Select Case response.StatusCode
      Case HttpStatusCode.OK
        reader = New StreamReader(response.GetResponseStream(), Encoding.Default)

      Case Else
        MsgBox(response.StatusCode)
        response.Close() 'no reader is returned, so release the connection
    End Select
  Catch
    If response IsNot Nothing Then response.Close()
  End Try
  Return reader
End Function

This is what the HTML looks like:

<tr>Text
<span>show all</span>
</tr>

Upvotes: 0

Views: 773

Answers (1)

Victor Zakharov

Reputation: 26424

If you insist on using strings, you could do something like this:

Do
  Dim line As String = reader.ReadLine()
  If line Is Nothing Then Exit Do 'end of stream
  If line.StartsWith("<span>") Then Continue Do 'ignore this line and move on to the next one
  'otherwise do some processing here
  '...
Loop
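
A minimal sketch of the same line-skipping idea wrapped in a reusable helper. The names `SpanLineFilter` and `ReadLinesExceptSpans` are hypothetical, and trimming leading whitespace and ignoring case before the comparison are assumptions about how the real page is indented, not something the question states. It accepts any TextReader, so the StreamReader returned by the question's function can be passed in directly.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module SpanLineFilter
    ' Hypothetical helper: returns every line of the reader except
    ' those whose first non-whitespace text is "<span>".
    Function ReadLinesExceptSpans(reader As TextReader) As List(Of String)
        Dim kept As New List(Of String)
        Do
            Dim line As String = reader.ReadLine()
            If line Is Nothing Then Exit Do 'end of stream
            'Assumption: indentation and tag case should not matter
            If line.TrimStart().StartsWith("<span>", StringComparison.OrdinalIgnoreCase) Then Continue Do
            kept.Add(line)
        Loop
        Return kept
    End Function

    Sub Main()
        'StringReader stands in for the StreamReader from the question
        Dim html As String = "<tr>Text" & vbLf & "<span>show all</span>" & vbLf & "</tr>"
        Using reader As New StringReader(html)
            For Each line As String In ReadLinesExceptSpans(reader)
                Console.WriteLine(line) 'prints "<tr>Text" and then "</tr>"
            Next
        End Using
    End Sub
End Module
```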

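But this approach is fragile: any minor change in the input HTML (extra indentation, an attribute on the tag, two tags on one line) can break your flow.

A more elegant solution would be to use XElement, shown here with a VB.NET XML literal: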

Dim xml = <tr>Text
            <span>show all</span>
          </tr>
xml.<span>.Remove()
MsgBox(xml.Value.Trim)
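
The XML literal above is typed into the source code, whereas the question reads the markup at run time. A minimal sketch of the same idea applied to a string, assuming the fragment is well-formed XML (real-world HTML often is not, in which case `XElement.Parse` throws an XmlException). The names `SpanRemover` and `TextWithoutSpans` are hypothetical.

```vbnet
Imports System
Imports System.Xml.Linq

Module SpanRemover
    ' Hypothetical helper: parses a well-formed fragment, removes every
    ' <span> element below the root, and returns the remaining text.
    Function TextWithoutSpans(fragment As String) As String
        Dim xml As XElement = XElement.Parse(fragment)
        xml.Descendants("span").Remove()
        Return xml.Value.Trim()
    End Function

    Sub Main()
        Dim html As String = "<tr>Text" & vbLf & "<span>show all</span>" & vbLf & "</tr>"
        Console.WriteLine(TextWithoutSpans(html)) 'prints "Text"
    End Sub
End Module
```

To use it with the question's function, read the whole response first (for example with `reader.ReadToEnd()`) and pass the resulting string to the helper.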

Upvotes: 1
