Reputation: 683
I am confused about why my Parallel.For loop keeps blowing up on an HttpClient call. The code works for about 10-15 requests, then hangs for a long time and errors out with a System.AggregateException. I have tried several variations, including WebClient. Please consider the following:
class Program
{
static void Main(string[] args)
{
Parallel.For(0, 250, i =>
{
var task = GetPages.CallHttp();
task.Wait();
var content = task.Result;
TestObject obj = content.ToTestObject(); //take the string and find some values in it.
Console.WriteLine(obj.H1Tag);
});
}
}
public static class GetPages
{
private static readonly HttpClient client = new HttpClient() {Timeout = new TimeSpan(0,5,0)};
public static async Task<string> CallHttp()
{
client.DefaultRequestHeaders.UserAgent.ParseAdd("customAgent/1.0");
string astr = await client.GetStringAsync("the url I am testing").ConfigureAwait(false);
return astr;
}
}
public static class StringExtensions
{
private static readonly object objLock = new object();
public static TestObject ToTestObject(this string content)
{
lock (objLock)
{
var obj = new TestObject();
// creates a bunch of properties inspecting the html string
var result = new HtmlExtractions(content);
obj.PageTitle = result.PageTitle;
obj.H1Tag = result.H1Tag;
...
return obj;
}
}
}
public class HtmlExtractions
{
internal HtmlDocument doc;
public HtmlExtractions(string contentToRead)
{
doc = new HtmlDocument();
doc.LoadHtml(contentToRead);
}
public string PageTitle => doc.DocumentNode.Descendants("title").FirstOrDefault()?.InnerHtml.Replace("&amp;", "&").Trim();
...
}
The result is that it throws the following exception:
System.AggregateException was unhandled by user code
HResult=-2146233088
Message=One or more errors occurred.
Source=mscorlib
StackTrace:
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Task.Wait()
at ConsoleApplication1.Program.<>c.<Main>b__0_0(Int32 i) in c:\users\username\documents\visual studio 2015\Projects\ConsoleApplication1\Program.cs:line 27
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
InnerException:
HResult=-2146233029
Message=A task was canceled.
InnerException: Id = 50, Status = Canceled, Method = "{null}", Result = "{Not yet computed}"
******Update as per Todd's suggestion (see the comments in the code). So frustrating. Even scaling down to 5 requests results in hanging.******
static async void RunPagesAsync()
{
Console.WriteLine("getting contents");
var tasks = Enumerable.Range(0, 5).Select(i => GetPages.CallHttp());
var contents = await Task.WhenAll(tasks);
Console.WriteLine("Got Contents..continuing");
foreach (var content in contents)
{
TestObject obj = content.ToTestObject(); //take the string and find some values in it.
Console.WriteLine(obj.H1Tag);
}
Console.WriteLine("completed");
}
static void Main(string[] args)
{
//Task.Run(() => RunPagesAsync()); //doesn't work.
// RunPagesAsync(); //just hangs after what looks like 2 iterations
// var tasks = Enumerable.Range(0, 5).Select(i => GetPages.CallHttp());
// var contents = await Task.WhenAll(tasks); //won't compile under a synchronous Main due to await
// foreach (var content in contents)
// {
// TestObject obj = content.ToTestObject(); //take the string and find some values in it.
// Console.WriteLine(obj.H1Tag);
// }
var tasks1 = Enumerable.Range(0, 5).Select(i => GetPages.CallHttp());
var contents1 = Task.WhenAll(tasks1);
contents1.Wait();
foreach (var content in contents1.Result)
{
TestObject obj = content.ToTestObject(); //take the string and find some values in it.
Console.WriteLine(obj.H1Tag);
}
}
Upvotes: 0
Views: 1100
Reputation: 39299
Async tasks (which free threads) and Parallel.For (which forces the use of multiple threads) don't tend to mix well. .Wait() and .Result are both blocking calls and invite deadlocks when used with async tasks. Try rewriting your Main method using Task.WhenAll and avoiding the blocking calls:
var tasks = Enumerable.Range(0, 250).Select(i => GetPages.CallHttp());
var contents = await Task.WhenAll(tasks);
foreach (var content in contents)
{
TestObject obj = content.ToTestObject(); //take the string and find some values in it.
Console.WriteLine(obj.H1Tag);
}
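Since await will not compile inside a synchronous Main, here is a minimal sketch of one way to host the snippet above in a console app. It assumes a pre-C# 7.1 compiler (the stack trace suggests Visual Studio 2015), so it moves the work into an async Task method and blocks exactly once at the entry point. FetchPageAsync and RunAsync are hypothetical names; FetchPageAsync is only a stand-in for GetPages.CallHttp() that simulates latency so the example runs without a network.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class Program
{
    // Hypothetical stand-in for GetPages.CallHttp(): simulates I/O latency
    // with Task.Delay instead of making a real HTTP request.
    public static async Task<string> FetchPageAsync(int i)
    {
        await Task.Delay(50).ConfigureAwait(false);
        return $"<h1>page {i}</h1>";
    }

    // Returns Task (not void) so the caller can wait for completion
    // and observe any exception.
    public static async Task<string[]> RunAsync(int count)
    {
        // ToArray() forces the lazy Select to run, starting each request exactly once.
        Task<string>[] tasks = Enumerable.Range(0, count)
            .Select(i => FetchPageAsync(i))
            .ToArray();

        // No thread is blocked while the requests are in flight.
        // Results come back in the same order as the tasks.
        return await Task.WhenAll(tasks);
    }

    public static void Main()
    {
        // The single blocking call in the program. A console app has no
        // SynchronizationContext, so blocking here does not deadlock.
        string[] contents = RunAsync(5).GetAwaiter().GetResult();

        foreach (string content in contents)
        {
            Console.WriteLine(content);
        }
    }
}
```

With C# 7.1 or later, Main can instead be declared as static async Task Main() and await RunAsync(5) directly. Note that an async void method, as in the question's update, gives the caller nothing to wait on, so Main can return before the requests finish.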
Upvotes: 1