user2163343

Reputation: 101

Speeding up web requests with HtmlAgilityPack

I'm fairly new to web scraping, so apologies if this is basic. I'm trying to reduce the run time of this application: running through a list of about 100 stocks takes over 30 seconds (I've included a list of just 5 for brevity). Is there any way to improve the efficiency with threading or asynchronous programming? I may be running into a limit on how much Yahoo's servers will send back at once to a single IP. Ultimately, my goal is to create a class "stock" with a number of properties that fetch web-based data like this.

    static void Main(string[] args)
    {
        List<string> stocks = new List<string>() { "AA", "AAL", "AAPL", "ABX", "ADBE" };
        foreach (var stock in stocks)
        {
            Task.Factory.StartNew(() => { getPrice(stock); });
        }
        Console.ReadLine();

    }
    private static void getPrice(string stock)
    {
        var webGet = new HtmlWeb();
        var doc = webGet.Load("http://finance.yahoo.com/q?s=" + stock);
        HtmlNode ourNode = doc.DocumentNode.SelectSingleNode("//*[@id=\"yfs_l84_" + stock.ToString().ToLower() + "\"]");
        if (ourNode != null)
        {
            Console.WriteLine(stock + ": " + ourNode.InnerText);
        }
    }

Upvotes: 2

Views: 26

Answers (1)

Stefano Castriotta

Reputation: 2913

Use a Parallel.ForEach loop, but don't expect a big improvement: the run time depends almost entirely (about 99%) on Yahoo's response time.

    Parallel.ForEach(stocks, stock =>
    {
        getPrice(stock);
    });

With Parallel.ForEach you can also set the degree of parallelism, that is, the maximum number of actions that run concurrently.

    Parallel.ForEach(stocks, new ParallelOptions() { MaxDegreeOfParallelism = 3 }, stock =>
    {
        getPrice(stock);
    });
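
To see the effect of MaxDegreeOfParallelism without hitting Yahoo, here is a minimal sketch that counts how many loop bodies run at the same time. FakeGetPrice and MeasurePeakConcurrency are hypothetical names introduced for this illustration, and the 200 ms sleep is an assumed stand-in for the network round trip.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class ParallelismDemo
{
    // Hypothetical stand-in for getPrice: sleeps instead of calling the network.
    static void FakeGetPrice(string stock)
    {
        Thread.Sleep(200);
    }

    // Runs the loop with the given cap and returns the highest number of
    // loop bodies that were observed executing at the same time.
    public static int MeasurePeakConcurrency(IEnumerable<string> stocks, int maxDegree)
    {
        var gate = new object();
        int running = 0;
        int peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(stocks, options, stock =>
        {
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }
            try
            {
                FakeGetPrice(stock);
            }
            finally
            {
                lock (gate) { running--; }
            }
        });

        return peak;
    }

    static void Main()
    {
        var stocks = new List<string> { "AA", "AAL", "AAPL", "ABX", "ADBE" };
        Console.WriteLine("Peak concurrency: " + MeasurePeakConcurrency(stocks, 3));
    }
}
```

The printed peak should never exceed the cap of 3. The scheduler may use fewer threads than the cap, so a lower value is possible.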

For more information, have a look at the MSDN documentation: https://msdn.microsoft.com/en-us/library/dd460720%28v=vs.110%29.aspx and https://msdn.microsoft.com/en-us/library/system.threading.tasks.parallel%28v=vs.110%29.aspx
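
Because the work is I/O-bound, an asynchronous variant may also be worth trying, since it does not tie up a thread while waiting for each response. The following is a sketch only, not a drop-in replacement: FetchAllAsync and fakeFetch are hypothetical names, and the fetch is passed in as a delegate so the example runs without the network. In a real program the delegate would wrap the actual download and the HtmlAgilityPack parsing from getPrice. The sketch assumes the stock symbols are unique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class AsyncFetchSketch
{
    // Starts one task per stock, but lets at most maxConcurrent fetches
    // run at any moment by gating them with a SemaphoreSlim.
    public static async Task<Dictionary<string, string>> FetchAllAsync(
        IEnumerable<string> stocks,
        Func<string, Task<string>> fetchPrice,
        int maxConcurrent)
    {
        using (var throttle = new SemaphoreSlim(maxConcurrent))
        {
            var tasks = stocks.Select(async stock =>
            {
                await throttle.WaitAsync();
                try
                {
                    string price = await fetchPrice(stock);
                    return new KeyValuePair<string, string>(stock, price);
                }
                finally
                {
                    throttle.Release();
                }
            }).ToList();

            // Wait for every fetch to finish, then index the results by symbol.
            var pairs = await Task.WhenAll(tasks);
            return pairs.ToDictionary(p => p.Key, p => p.Value);
        }
    }

    static void Main()
    {
        var stocks = new List<string> { "AA", "AAL", "AAPL", "ABX", "ADBE" };

        // Assumed stand-in for the real request: waits 200 ms, returns a dummy value.
        Func<string, Task<string>> fakeFetch = async stock =>
        {
            await Task.Delay(200);
            return "price-of-" + stock;
        };

        var prices = FetchAllAsync(stocks, fakeFetch, 3).GetAwaiter().GetResult();
        foreach (var stock in stocks)
        {
            Console.WriteLine(stock + ": " + prices[stock]);
        }
    }
}
```

As with Parallel.ForEach, the limit of 3 is arbitrary. If Yahoo throttles requests per IP, raising it is unlikely to help much.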

Upvotes: 1
