bbrez1

C# multithreaded downloading of webpages using proxies - performance issues

I'm making a multithreaded proxy checker. I have my own multithreading algorithm which basically starts up a bunch of threads (50+), with each thread connecting to a webpage and simply downloading and checking the response. If the response contains a certain string, I assume the proxy is working. The problem occurs when each of the 50 threads tries to download the webpage at the same time. The webpage itself is 400 kB in size, so it can take some time to download through a proxy. When I do this without proxies I get 50 results in almost no time, but with proxies I get results in "batches" of 2 or 3 and it's far too slow. I'm using a simple WebClient object with a small timeout. I have a 100 Mbit connection and it's only using about 10% of that. I tried to find a few solutions online with no luck. The multithreading part of the code works without any problems because I have used it in numerous projects before and it is well polished.

With some playing around I found out that all of the 50 threads eventually reach the same line of code (at the exact same time), the one that downloads the source of the page, but then they stall there:

    result = webClient.DownloadString(url);

I added a simple before-and-after timer around this line to test how long the download takes. One would assume it would not take any more than 5 seconds (since that is the timeout), but the measured times are huge and just keep piling up (up to 120 seconds). So I guess there is a limit somewhere on how many connections can be active at once. Since I have 50 threads running at the same time, I also want to be downloading 50 pages at once, not waiting for the previous ones to finish.

I have tried using:

    System.Net.ServicePointManager.DefaultConnectionLimit = int.MaxValue;

with no luck, however. This is my code:

    public class AwesomeWebClient : WebClient
    {
        protected override WebRequest GetWebRequest(Uri address)
        {
            // Apply the 5 second timeout to every request this client makes.
            WebRequest request = base.GetWebRequest(address);
            request.Timeout = 5000;
            return request;
        }
    }

    private static string Get(string url, string proxy, string UA)
    {
        string result = "";

        try
        {
            var webClient = new AwesomeWebClient();
            webClient.Headers.Add("Referer", "http://yahoo.com");
            webClient.Headers.Add("X-Requested-With", "XMLHttpRequest");
            webClient.Headers.Add("Accept", "*");
            webClient.Headers.Add("User-Agent", UA);
            webClient.Proxy = new WebProxy(proxy);
            result = webClient.DownloadString(url);
        }
        catch (Exception x)
        {
            //Console.WriteLine(x.Message + " | " + url);
        }

        return result;
    }


Answers (1)

Jim Mischel

There's a lot of stuff going on behind the scenes with WebClient, any one of which could be the bottleneck. After all, WebClient is just a convenient wrapper around HttpWebRequest. One thing that can cause problems here is DNS resolution, which can limit the number of concurrent requests you can make, although I can't see it causing the kind of slowdown you describe.

But quite likely the problem is the threading. In your single-threaded model, you have one thread that gets one document at a time. That it can do very quickly. With 50 threads, you have the overhead of thread context switches. So one thread gets a few tens of kilobytes, but then it gets swapped out for the next thread. That context switch overhead is going to slow things down.

You should consider reducing the number of threads. What happens if you do this with two threads? How about four threads? If you limit the amount of thread context switching, you're going to speed up your program.
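
For illustration, capping the parallelism could look something like this (a sketch, not your exact code), using Parallel.ForEach with MaxDegreeOfParallelism from System.Threading.Tasks. Here proxies and url stand in for your own data, and Get is the method from your question:

    // Illustrative sketch only: a capped degree of parallelism instead of
    // 50 free-running threads. Experiment with the number.
    var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

    Parallel.ForEach(proxies, options, proxy =>
    {
        string page = Get(url, proxy, "Mozilla/5.0");
        bool works = page.Contains("certain string"); // same check you do now
    });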

The other thing you could try is DownloadStringAsync, although even then you should limit the number of concurrent requests.
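
If you go that route, a rough sketch might look like the following, assuming the Task-based DownloadStringTaskAsync from .NET 4.5 and a SemaphoreSlim to cap the number of in-flight requests. GetAsync is just a hypothetical async counterpart to your Get method:

    // Sketch only, assuming .NET 4.5+ (System.Threading, System.Threading.Tasks).
    // The semaphore caps concurrent requests at 4; tune that number.
    private static readonly SemaphoreSlim throttle = new SemaphoreSlim(4);

    private static async Task<string> GetAsync(string url, string proxy, string UA)
    {
        await throttle.WaitAsync();
        try
        {
            using (var webClient = new AwesomeWebClient())
            {
                webClient.Headers.Add("User-Agent", UA);
                webClient.Proxy = new WebProxy(proxy);
                return await webClient.DownloadStringTaskAsync(url);
            }
        }
        catch (Exception)
        {
            return ""; // same "swallow and return empty" behavior as Get()
        }
        finally
        {
            throttle.Release();
        }
    }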

Finally, and I don't know if this is still true, but in the past it was much faster to create the WebClient once and use it for multiple files than it was to create a new WebClient for each download. That is, this code:

    WebClient myClient = new WebClient();
    foreach (var url in urlsList)
    {
        myClient.DownloadString(url);
    }

was significantly faster than this code:

    foreach (var url in urlsList)
    {
        WebClient myClient = new WebClient();
        myClient.DownloadString(url);
    }

I never tracked down the reason why, and I've seen some people say that it's no longer the case. But I haven't tested it myself recently.
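
Applied to your proxy checker, reusing one client per worker and only swapping the proxy between requests would look roughly like this. Again, just a sketch: proxiesForThisWorker and url are placeholders for your own data:

    // Rough sketch: one WebClient reused for a batch of proxies instead of a
    // new client per request.
    var myClient = new AwesomeWebClient();

    foreach (var proxy in proxiesForThisWorker)
    {
        myClient.Headers["User-Agent"] = "Mozilla/5.0"; // re-set in case it gets cleared
        myClient.Proxy = new WebProxy(proxy);
        try
        {
            string page = myClient.DownloadString(url);
            // check "page" for the expected string here
        }
        catch (Exception)
        {
            // treat as a dead or slow proxy
        }
    }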
