Reputation: 184
I have an application that shows the user a preview of an XML page in a WebBrowser component after NavUserPassword authentication, and then shows a side panel that parses it into meaningful data. However, I cannot find an effective way to export all of the XML from the WebBrowser component as a string.
An example of such a page that needs no authentication is https://services.odata.org/Northwind/Northwind.svc/
I have the code below, but it throws a MissingMemberException: "Public member 'XMLDocument' on type 'HTMLDocumentClass' not found."
Private Sub WebBrowserAuthEx1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowserAuthExt1.DocumentCompleted
    Dim doc As XmlDocument = New XmlDocument()
    doc.LoadXml(WebBrowserAuthExt1.Document.DomDocument.XMLDocument) ' Throws MissingMemberException
    MessageBox.Show(doc.Value.ToString)
End Sub
How can I get the XML DOM in the WebBrowser to give me all of the XML?
The control behaves like a normal WebBrowser, but the XML must come from it because its session is already authenticated, and I don't want to authenticate a second stream.
Upvotes: 0
Views: 189
Reputation: 11801
For the example URL that you provided, you can obtain the XML with something like the following code:
Dim xmlText As String = WebBrowser1.Document.All.Item(0).InnerText
Edit: The OP pointed out (in a rejected edit) that the text returned by the code above includes a "- " on some lines. This is a consequence of the source being formatted as a tree structure and not as pure XML. Their solution was the following:
' The text also includes the code-folding dashes; use the code below to sanitize the data.
If xmlText IsNot Nothing Then
    xmlText = xmlText.Replace("- ", "")
End If
This use of Replace risks unintended modification of the data, so I suggest the following alternative, which limits potential changes to the beginning of each line.
Dim sb As New System.Text.StringBuilder(xmlText.Length)
Using sr As New System.IO.StringReader(xmlText)
    Do While sr.Peek() <> -1
        Dim line As String = sr.ReadLine()
        ' Remember where this line starts inside the builder.
        Dim startOfLineIndex As Int32 = sb.Length
        sb.AppendLine(line)
        ' Blank out a dash only when it is the first character of the line.
        If sb.Chars(startOfLineIndex) = "-"c Then sb.Chars(startOfLineIndex) = " "c
    Loop
End Using
xmlText = sb.ToString()
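To illustrate the difference between the two approaches, here is a minimal console sketch. The helper name StripLeadingDashes and the sample text are made up for the demonstration; the sample simply assumes that a "- " can also occur inside real element text.

```vb
Imports System
Imports System.IO
Imports System.Text

Module FoldMarkerDemo
    ' Hypothetical helper: blanks out a "-" only when it is the first character of a line.
    Function StripLeadingDashes(text As String) As String
        If String.IsNullOrEmpty(text) Then Return String.Empty
        Dim sb As New StringBuilder(text.Length)
        Using sr As New StringReader(text)
            Dim line As String = sr.ReadLine()
            Do While line IsNot Nothing
                If line.StartsWith("-", StringComparison.Ordinal) Then
                    line = " " & line.Substring(1)
                End If
                sb.AppendLine(line)
                line = sr.ReadLine()
            Loop
        End Using
        Return sb.ToString()
    End Function

    Sub Main()
        ' Made-up sample: the first line carries a fold marker, the second has "- " inside real data.
        Dim sample As String = "- <root>" & Environment.NewLine &
                               "  <item>a - b</item>" & Environment.NewLine &
                               "</root>"

        ' A global Replace also rewrites the element text ("a - b" becomes "a b").
        Console.WriteLine(sample.Replace("- ", ""))

        ' Line-start handling leaves "a - b" intact.
        Console.WriteLine(StripLeadingDashes(sample))
    End Sub
End Module
```

The helper does the same job as the StringBuilder loop above, just packaged so the two results can be compared side by side.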
Upvotes: 2
Reputation: 18310
If this is the built-in System.Windows.Forms.WebBrowser control, you can use the DocumentText property to get the website's HTML (basically XML) code.
doc.LoadXml(WebBrowserAuthExt1.DocumentText)
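Note that LoadXml throws an XmlException when the text is not well-formed XML, which may happen if DocumentText turns out to hold markup other than the raw feed. A minimal sketch of a guard follows; TryLoadXml is a hypothetical helper name, not part of the framework.

```vb
Imports System
Imports System.Xml

Module XmlLoadSketch
    ' Hypothetical helper: returns Nothing instead of throwing when the text is not well-formed XML.
    Function TryLoadXml(text As String) As XmlDocument
        If String.IsNullOrWhiteSpace(text) Then Return Nothing
        Dim doc As New XmlDocument()
        Try
            doc.LoadXml(text)
            Return doc
        Catch ex As XmlException
            ' The text could not be parsed as XML.
            Return Nothing
        End Try
    End Function
End Module
```

In the DocumentCompleted handler you could then call TryLoadXml(WebBrowserAuthExt1.DocumentText) and, if the result is not Nothing, display doc.OuterXml, which returns the full markup of the loaded document.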
Upvotes: 0