Blaise

Reputation: 22212

How to call WebBrowser Navigate to go through a number of urls?

To collect information on a webpage, I can use the WebBrowser.Navigated event.

First, navigate to the url:

WebBrowser wbCourseOverview = new WebBrowser();
wbCourseOverview.ScriptErrorsSuppressed = true;
wbCourseOverview.Navigate(url);
wbCourseOverview.Navigated += wbCourseOverview_Navigated;

Then process the webpage when Navigated is called:

void wbCourseOverview_Navigated(object sender, WebBrowserNavigatedEventArgs e)
{
    //Find the control and invoke "Click" event...
}

The difficult part comes when I try to go through a string array of urls.

foreach (var u in courseUrls)
{
    WebBrowser wbCourseOverview = new WebBrowser();
    wbCourseOverview.ScriptErrorsSuppressed = true;
    wbCourseOverview.Navigate(u);

    wbCourseOverview.Navigated += wbCourseOverview_Navigated;
}

Here, because the page load takes time, wbCourseOverview_Navigated is never reached.

I tried to use async/await from C# 5. Tasks and the Event-based Asynchronous Pattern (EAP) are described here. Another example can be found in The Task-based Asynchronous Pattern.

The problem is that WebClient has async methods such as DownloadDataAsync and DownloadStringAsync, but WebBrowser has no NavigateAsync.

Can any expert give me some advice? Thank you.


There is a post on Stack Overflow (here), but does anyone know how to implement the struct described in its answer?


Update again.

Another Stack Overflow post (here) suggested this extension method:

public static Task WhenDocumentCompleted(this WebBrowser browser)
{
    var tcs = new TaskCompletionSource<bool>();
    browser.DocumentCompleted += (s, args) => tcs.SetResult(true);
    return tcs.Task;
}

So I have:

foreach (var c in courseBriefs)
{
    wbCourseOverview.Navigate(c.Url);
    await wbCourseOverview.WhenDocumentCompleted();
}

It works until the web browser visits the second url, where this exception is thrown:

An attempt was made to transition a task to a final state when it had already completed.

I know I must have made a mistake inside the foreach loop. I suspect it is because the DocumentCompleted event has not yet been raised when the loop reaches its second iteration. What is the correct way to write this await in a foreach loop?

Upvotes: 1

Views: 9053

Answers (2)

alex.b

Reputation: 4567

There is a post on Stack Overflow (here), but does anyone know how to implement the struct described in its answer?

OK, so you want some code with an awaiter. I've written two samples. The first one uses the TPL's built-in awaiter:

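using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;
using System.Windows.Forms;

public partial class Form1 : Form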
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            ProcessUrlsAsync(new[] { "http://google.com", "http://microsoft.com", "http://yahoo.com" })
                .Start();
        }

        private Task ProcessUrlsAsync(string[] urls)
        {
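            // Caution: once Start() is called, this delegate runs on a thread-pool thread,
            // while WebBrowser is a UI control that is normally expected to be used from the UI thread.
            return new Task(() =>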
            {
                foreach (string url in urls)
                {
                    TaskAwaiter<string> awaiter = ProcessUrlAsync(url);
                    // or the next line, in case we use method *
                    // TaskAwaiter<string> awaiter = ProcessUrlAsync(url).GetAwaiter();                     
                    string result = awaiter.GetResult();

                    MessageBox.Show(result);
                }
            });
        }        

        // Awaiter inside
        private TaskAwaiter<string> ProcessUrlAsync(string url)
        {
            TaskCompletionSource<string> taskCompletionSource = new TaskCompletionSource<string>();
            var handler = new WebBrowserDocumentCompletedEventHandler((s, e) =>
            {
                // TODO: put custom processing of document right here
                taskCompletionSource.SetResult(e.Url + ": " + webBrowser1.Document.Title);
            });
            webBrowser1.DocumentCompleted += handler;
            taskCompletionSource.Task.ContinueWith(s => { webBrowser1.DocumentCompleted -= handler; });

            webBrowser1.Navigate(url);
            return taskCompletionSource.Task.GetAwaiter();
        }

        // (*) Task<string> instead of Awaiter
        //private Task<string> ProcessUrlAsync(string url)
        //{
        //    TaskCompletionSource<string> taskCompletionSource = new TaskCompletionSource<string>();
        //    var handler = new WebBrowserDocumentCompletedEventHandler((s, e) =>
        //    {
        //        taskCompletionSource.SetResult(e.Url + ": " + webBrowser1.Document.Title);
        //    });
        //    webBrowser1.DocumentCompleted += handler;
        //    taskCompletionSource.Task.ContinueWith(s => { webBrowser1.DocumentCompleted -= handler; });

        //    webBrowser1.Navigate(url);
        //    return taskCompletionSource.Task;
        //}
    }
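
Both variants above detach the handler once the task completes. That matters because a TaskCompletionSource can be completed only once: a handler left attached from an earlier navigation calls SetResult again on the next DocumentCompleted and throws the InvalidOperationException quoted in the question. Below is a minimal sketch that reproduces this without a browser. FakeBrowser, WhenCompletedLeaky and WhenCompletedOnce are hypothetical names used only for illustration; FakeBrowser merely stands in for WebBrowser.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for WebBrowser: it only raises an event.
public class FakeBrowser
{
    public event EventHandler DocumentCompleted;

    public void RaiseCompleted()
    {
        EventHandler h = DocumentCompleted;
        if (h != null) h(this, EventArgs.Empty);
    }
}

public static class Demo
{
    // Same shape as the extension method in the question: the handler is never removed.
    public static Task WhenCompletedLeaky(FakeBrowser browser)
    {
        var tcs = new TaskCompletionSource<bool>();
        browser.DocumentCompleted += (s, e) => tcs.SetResult(true);
        return tcs.Task;
    }

    // One-shot version: the handler detaches itself, and TrySetResult never throws.
    public static Task WhenCompletedOnce(FakeBrowser browser)
    {
        var tcs = new TaskCompletionSource<bool>();
        EventHandler handler = null;
        handler = (s, e) =>
        {
            browser.DocumentCompleted -= handler;
            tcs.TrySetResult(true);
        };
        browser.DocumentCompleted += handler;
        return tcs.Task;
    }

    public static void Main()
    {
        var leaky = new FakeBrowser();
        WhenCompletedLeaky(leaky);
        leaky.RaiseCompleted();      // first "page": completes the first task
        WhenCompletedLeaky(leaky);
        try
        {
            leaky.RaiseCompleted();  // second "page": the first handler runs again
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine("Leaky handler: " + ex.Message);
        }

        var fixedBrowser = new FakeBrowser();
        Task first = WhenCompletedOnce(fixedBrowser);
        fixedBrowser.RaiseCompleted();
        Task second = WhenCompletedOnce(fixedBrowser);
        fixedBrowser.RaiseCompleted();
        Console.WriteLine("One-shot handler: both tasks completed = "
            + (first.IsCompleted && second.IsCompleted));
    }
}
```

Detaching inside the handler, rather than in a ContinueWith callback, removes the subscription before the next event can possibly be raised.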

The next sample contains an implementation of the awaiter struct Eric Lippert was talking about here.

public partial class Form1 : Form
    {
        public struct WebBrowserAwaiter
        {
            private readonly WebBrowser _webBrowser;
            private readonly string _url;

            private readonly TaskAwaiter<string> _innerAwaiter;

            public bool IsCompleted
            {
                get
                {
                    return _innerAwaiter.IsCompleted;
                }
            }

            public WebBrowserAwaiter(WebBrowser webBrowser, string url)
            {
                _url = url;
                _webBrowser = webBrowser;
                _innerAwaiter = ProcessUrlAwaitable(_webBrowser, url);
            }

            public string GetResult()
            {
                return _innerAwaiter.GetResult();

            }

            public void OnCompleted(Action continuation)
            {
                _innerAwaiter.OnCompleted(continuation);
            }

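            // Static, because a struct constructor cannot call instance methods before all fields are assigned.
            private static TaskAwaiter<string> ProcessUrlAwaitable(WebBrowser webBrowser, string url)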
            {
                TaskCompletionSource<string> taskCompletionSource = new TaskCompletionSource<string>();
                var handler = new WebBrowserDocumentCompletedEventHandler((s, e) =>
                {
                    // TODO: put custom processing of document here
                    taskCompletionSource.SetResult(e.Url + ": " + webBrowser.Document.Title);
                });
                webBrowser.DocumentCompleted += handler;
                taskCompletionSource.Task.ContinueWith(s => { webBrowser.DocumentCompleted -= handler; });

                webBrowser.Navigate(url);
                return taskCompletionSource.Task.GetAwaiter();
            }
        }

        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            ProcessUrlsAsync(new[] { "http://google.com", "http://microsoft.com", "http://yahoo.com" })
                .Start();
        }

        private Task ProcessUrlsAsync(string[] urls)
        {
            return new Task(() =>
            {
                foreach (string url in urls)
                {
                    var awaiter = new WebBrowserAwaiter(webBrowser1, url);
                    string result = awaiter.GetResult();

                    MessageBox.Show(result);
                }
            });
        }
    }   
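
With C# 5, the same TaskCompletionSource idea can also be awaited directly from the UI thread, so nothing blocks in GetResult() and the WebBrowser is only touched from the thread that owns it. The following is a rough sketch under assumptions, not a definitive implementation: NavigateAsync is a hypothetical extension method name (WebBrowser has no such member), and DemoForm is a hypothetical form that creates its controls in code instead of using the designer.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

public static class WebBrowserExtensions
{
    // Hypothetical helper: navigates and completes when DocumentCompleted is raised.
    public static Task<string> NavigateAsync(this WebBrowser browser, string url)
    {
        var tcs = new TaskCompletionSource<string>();
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (s, e) =>
        {
            // DocumentCompleted can be raised more than once per page (for example for frames),
            // so detach immediately and use TrySetResult instead of SetResult.
            browser.DocumentCompleted -= handler;
            // TODO: put custom processing of the document here
            tcs.TrySetResult(e.Url + ": " + browser.Document.Title);
        };
        browser.DocumentCompleted += handler;
        browser.Navigate(url);
        return tcs.Task;
    }
}

public class DemoForm : Form
{
    private readonly WebBrowser webBrowser1 =
        new WebBrowser { Dock = DockStyle.Fill, ScriptErrorsSuppressed = true };
    private readonly Button button1 = new Button { Text = "Go", Dock = DockStyle.Top };

    public DemoForm()
    {
        Controls.Add(webBrowser1);
        Controls.Add(button1);
        button1.Click += button1_Click;
    }

    // async void is acceptable here because this is a top-level event handler.
    private async void button1_Click(object sender, EventArgs e)
    {
        foreach (string url in new[] { "http://google.com", "http://microsoft.com", "http://yahoo.com" })
        {
            // The loop resumes on the UI thread after each page has finished loading.
            string result = await webBrowser1.NavigateAsync(url);
            MessageBox.Show(result);
        }
    }

    [STAThread]
    private static void Main()
    {
        Application.Run(new DemoForm());
    }
}
```

Because each call creates its own TaskCompletionSource and its handler detaches itself, the second url no longer hits a task that has already completed.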

Hope this helps.

Upvotes: 4

KF2

Reputation: 10143

Instead of wbCourseOverview_Navigated, use webBrowser1_DocumentCompleted. When the first URL has finished loading, do your work and then navigate to the next URL:

    List<string> urls = new List<string>();
    int count = 0;

    public Form1()
    {
        InitializeComponent();
        webBrowser1.DocumentCompleted += webBrowser1_DocumentCompleted;
    }

    private void Form1_Load(object sender, EventArgs e)
    {
        NavigateToNextUrl();
    }

    private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        //Do something
        NavigateToNextUrl();
    }

    // Navigates to the next url if any remain, so the index never runs past the end of the list.
    private void NavigateToNextUrl()
    {
        if (count < urls.Count)
        {
            webBrowser1.Navigate(urls[count++]);
        }
    }

Upvotes: 0
