WebEngine

Reputation: 157

Awesomium - How can I get HTML source with frameset

I want to get the HTML source of a web page so that I can analyze it, so I use code like this:

(Work.URL is just a String field in a structure.)

Dim View As WebView = WebCore.CreateWebView(1000, 600, WebCore.Sessions.Last())

' Attach the handler before navigating so the event cannot be missed.
AddHandler View.LoadingFrameComplete, Sub(sender As Object, e As FrameEventArgs)
    ' Only react once the main frame has finished loading.
    If Not e.IsMainFrame Then Exit Sub
    Console.WriteLine(View.HTML)
End Sub

View.Source = New Uri(Work.URL)

The code works well. Sample result:

<!doctype html>
<html>
    <head>
        ...
    </head>
    <frameset cols="*,*">
        <frame src="test1.html" />
        <frame src="test2.html" />
    </frameset>
</html>

But I want to get the HTML source including the content of each frame inside the frameset, like this:

(as shown in Chrome Developer Tools)

<!doctype html>
<html>
    <head>
        ...
    </head>
    <frameset cols="*,*">
        <frame src="test1.html">
            <!doctype html>
            <html>
                <head>
                    ...
                </head>
                <body>
                    This page is TEST1.
                </body>
            </html>
        </frame>
        <frame src="test2.html">
            <!doctype html>
            <html>
                <head>
                    ...
                </head>
                <body>
                    This page is TEST2.
                </body>
            </html>
        </frame>
    </frameset>
</html>

How can I get this HTML source?

Upvotes: 1

Views: 1819

Answers (2)

Xan-Kun Clark-Davis

Reputation: 2843

This is a built-in property that gives you the static HTML code that was set when the page was loaded. The timing for this one is crucial:

 webControl.HTML;

This call uses JavaScript to get the actual, dynamic source code of the page. This is what you would see in Firebug:

 webControl.ExecuteJavascriptWithResult("document.getElementsByTagName('html')[0].innerHTML");

I prefer:

 form.webControl.ExecuteJavascriptWithResult("document.documentElement.outerHTML");

I also read that they are working on a "source" property that will hide the timing issues and will hopefully give the real HTML.
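
Neither call above descends into the frames of a frameset. As a rough sketch only (not tested against Awesomium), the same JavaScript approach could be extended to walk window.frames and read each frame's document.documentElement.outerHTML. The function name collectFrameSources and the shape of the returned object are my own assumptions for illustration. Frames from a different origin will normally refuse access, so the sketch records the error instead of failing.

```javascript
// Sketch: collect the live HTML of a window and, recursively, of its frames.
// collectFrameSources is a hypothetical helper name, not an Awesomium API.
function collectFrameSources(win) {
    var result = { url: null, html: null, frames: [] };
    try {
        result.url = String(win.location.href);
        result.html = win.document.documentElement.outerHTML;
    } catch (e) {
        // Cross-origin frames throw on access; record that and stop here.
        result.error = String(e);
        return result;
    }
    // window.frames is array-like: frames[i] is the child window object.
    for (var i = 0; i < win.frames.length; i++) {
        result.frames.push(collectFrameSources(win.frames[i]));
    }
    return result;
}
```

If this works in your version, the script text plus a final JSON.stringify(collectFrameSources(window)) could be passed to ExecuteJavascriptWithResult, and the JSON string parsed on the .NET side. The timing issue still applies: the frames have to be loaded before the script runs.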

Upvotes: 0

voytek

Reputation: 2222

This is one way to get the source code:

string source = webControl.ExecuteJavascriptWithResult("document.getElementsByTagName('html')[0].innerHTML");

Or you can try this:

string source = webControl.HTML;

EDIT: Remember that when using webControl.HTML, you need to wait until the document is loaded (DocumentReadyState.Loaded).

Upvotes: 1
