Read XML from a webbrowser component

I have an application that shows the user a preview of an XML page in a WebBrowser component after NavUserPassword authentication, alongside a side panel that parses the page into meaningful data. However, I cannot find an effective way to get all of the XML out of the WebBrowser component as a string.

An example of such a page, without authentication, is https://services.odata.org/Northwind/Northwind.svc/

I have the code below, but it throws a MissingMemberException: "Public member 'XMLDocument' on type 'HTMLDocumentClass' not found."

Private Sub WebBrowserAuthEx1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowserAuthExt1.DocumentCompleted
    Dim doc As XmlDocument = New XmlDocument()
    doc.LoadXml(WebBrowserAuthExt1.Document.DomDocument.XMLDocument) ' Throws MissingMemberException
    MessageBox.Show(doc.Value.ToString)
End Sub

How can I get the XML DOM in the WebBrowser to give me all of the XML?

It behaves like a normal WebBrowser control, but the XML has to come out of it because that session is already authenticated, and I don't want to authenticate a second stream.

Upvotes: 0

Views: 189

Answers (2)

TnTinMn

Reputation: 11801

For the example URL that you provided, you can obtain the XML with something like the following code:

Dim xmlText As String = WebBrowser1.Document.All.Item(0).InnerText

Edit: The OP pointed out (in a rejected edit) that the text returned by the above includes a "- " on some lines. This is a consequence of the source being formatted as a tree structure and not as pure XML. Their solution was the following:

' It also includes the code folding dashes, use the below to sanitize the data.
If xmlText <> Nothing Then
    xmlText = xmlText.Replace("- ", "")
End If

This use of Replace risks unintended modification of the data, since any "- " inside the XML content itself would be removed as well. I suggest the following alternative, which limits potential changes to the beginning of each line.

Dim sb As New System.Text.StringBuilder(xmlText.Length)
Using sr As New System.IO.StringReader(xmlText)
    Do While sr.Peek <> -1
        Dim line As String = sr.ReadLine()
        Dim startOfLineIndex As Int32 = sb.Length
        sb.AppendLine(line)
        If sb.Chars(startOfLineIndex) = "-"c Then sb.Chars(startOfLineIndex) = " "c
    Loop
End Using
xmlText = sb.ToString()
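To connect this back to the question, the cleaned text could then be loaded into an XmlDocument. The following is only a minimal sketch: StripFoldMarkers and TryLoadXml are hypothetical helper names, and it assumes that the fold markers appear only as the first character of a line and that the remaining text is well-formed XML, which may not hold for every page the control renders.

```vbnet
Imports System.IO
Imports System.Text
Imports System.Xml

Module XmlPreviewHelpers
    ' Hypothetical helper: blanks a leading "-" on each line, like the loop above.
    Function StripFoldMarkers(text As String) As String
        If String.IsNullOrEmpty(text) Then Return String.Empty
        Dim sb As New StringBuilder(text.Length)
        Using sr As New StringReader(text)
            Dim line As String = sr.ReadLine()
            Do While line IsNot Nothing
                If line.StartsWith("-", StringComparison.Ordinal) Then
                    line = " " & line.Substring(1)
                End If
                sb.AppendLine(line)
                line = sr.ReadLine()
            Loop
        End Using
        Return sb.ToString()
    End Function

    ' Hypothetical helper: returns Nothing if the cleaned text does not parse.
    Function TryLoadXml(text As String) As XmlDocument
        Dim doc As New XmlDocument()
        Try
            ' TrimStart because an XML declaration must be the very first thing in the text.
            doc.LoadXml(StripFoldMarkers(text).TrimStart())
            Return doc
        Catch ex As XmlException
            Return Nothing
        End Try
    End Function
End Module
```

From the DocumentCompleted handler this could be called as TryLoadXml(WebBrowser1.Document.All.Item(0).InnerText), checking the result for Nothing before using it.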

Upvotes: 2

Visual Vincent

Reputation: 18310

If this is the built-in System.Windows.Forms.WebBrowser control, you can use its DocumentText property to get the page's HTML (basically XML) source.

doc.LoadXml(WebBrowserAuthExt1.DocumentText)

Upvotes: 0
