William Dunne

Reputation: 479

WebBrowser.Document is null on return from thread, not updating in new thread

public static User registerUser()
{
    Uri test = new Uri("https://www.example.com/signup");
    HtmlDocument testdoc = runBrowserThread(test);

    string tosend = "test";

    User user = new User();

    user.apikey = tosend;

    return user;

}
public static HtmlDocument runBrowserThread(Uri url)
{
    HtmlDocument value = null;
    var th = new Thread(() =>
    {
        var br = new WebBrowser();
        br.DocumentCompleted += browser_DocumentCompleted;
        br.Navigate(url);
        value = br.Document;
        Application.Run();
    });
    th.SetApartmentState(ApartmentState.STA);
    th.Start();
    th.Join(8000); 
    return value;
}

static void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    var br = sender as WebBrowser;
    if (br.Url == e.Url)
    {
        Console.WriteLine("Navigated to {0}", e.Url);
        Console.WriteLine(br.Document.Body.InnerHtml);
        System.Console.ReadLine();
        Application.ExitThread();   // Stops the thread
    }
}

I am trying to scan this page. The new thread does get the HTML, but it does not pass it back to the calling function; runBrowserThread returns null instead (I presume the document is only populated after the page has been processed).

How can I make it so that the new thread passes back its result?

Upvotes: 0

Views: 210

Answers (2)

William Dunne

Reputation: 479

I moved over to using WatiN instead of the default browser model. It is a bit buggy, but it now works as it should.

using (var browser = new FireFox("https://www.example.com/signup"))
{
    browser.GoTo("https://example.com/signup");
    browser.WaitForComplete();
}

Upvotes: 0

Sriram Sakthivel

Reputation: 73502

There are several problems with your approach.

  • You're not waiting until the page has been navigated to, i.e. until the Navigated event fires, so Document can still be null when you read it.
  • You give up after 8 seconds; if the page takes longer than 8 seconds to load, you won't get the document.
  • If the document doesn't load properly, you leave the thread alive.
  • I suspect the WebBrowser control will not work as expected unless you add it to a form and show it (it needs to be visible on screen).

And so on.
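
If you do want to stay with WebBrowser, the points above suggest capturing the result inside the completion handler rather than right after Navigate. The following is only a rough sketch of that idea, not a tested fix: the BrowserRunner class name, the timeout parameter, and the choice to return the HTML as a string (instead of an HtmlDocument, which belongs to the browser's STA thread) are my own assumptions.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

static class BrowserRunner
{
    // Hypothetical helper: returns the page HTML, or null if the timeout elapses first.
    public static string RunBrowserThread(Uri url, TimeSpan timeout)
    {
        string html = null;

        var th = new Thread(() =>
        {
            using (var br = new WebBrowser())
            {
                br.DocumentCompleted += (sender, e) =>
                {
                    // DocumentCompleted can also fire for frames; act only on the top-level URL.
                    if (br.Url == e.Url)
                    {
                        html = br.DocumentText;      // copy the result while still on the STA thread
                        Application.ExitThread();    // ends the message loop started below
                    }
                };

                br.Navigate(url);
                Application.Run();                   // pump messages until ExitThread is called
            }
        });

        th.SetApartmentState(ApartmentState.STA);
        th.IsBackground = true;                      // a stuck page will not keep the process alive
        th.Start();

        // Join returns false on timeout; in that case no result is reported.
        return th.Join(timeout) ? html : null;
    }
}
```

The key difference from the code in the question is that the value is assigned in the handler, after the page has loaded, and only read by the caller once Join reports that the thread has finished.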

Don't confuse the tool with the goal: using WebBrowser can't be the goal in itself. If you just need to download the page as a string, use HttpClient.GetStringAsync.
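
A minimal sketch of the HttpClient route, assuming the page does not rely on JavaScript to produce its content (GetStringAsync only returns the raw response body). PageDownloader and DownloadHtmlAsync are names I made up for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

static class PageDownloader
{
    // HttpClient is meant to be created once and reused across requests.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: downloads the response body of the given URL as a string.
    // GetStringAsync throws HttpRequestException for a non-success status code.
    public static Task<string> DownloadHtmlAsync(Uri url)
    {
        return Client.GetStringAsync(url);
    }
}
```

From an async method you would call it along the lines of `string html = await PageDownloader.DownloadHtmlAsync(new Uri("https://www.example.com/signup"));`, with no extra thread or message loop involved.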

Once you have the page as a string, use HtmlAgilityPack if you want to manipulate the HTML.
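
For example, something like the following could pull a single value out of the downloaded string. It is a sketch that assumes the HtmlAgilityPack NuGet package is referenced; HtmlExample and ExtractTitle are hypothetical names, and the `//title` XPath is just a stand-in for whatever element you actually need.

```csharp
using System;
using HtmlAgilityPack;

static class HtmlExample
{
    // Hypothetical helper: returns the text of the <title> element, or null if there is none.
    public static string ExtractTitle(string html)
    {
        var doc = new HtmlDocument();   // HtmlAgilityPack.HtmlDocument, not the WinForms type
        doc.LoadHtml(html);

        // SelectSingleNode returns null when the XPath matches nothing.
        var titleNode = doc.DocumentNode.SelectSingleNode("//title");
        return titleNode == null ? null : titleNode.InnerText;
    }
}
```

Note that HtmlAgilityPack has its own HtmlDocument class, so avoid importing System.Windows.Forms in the same file or the two names will clash.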

Upvotes: 1
