RegularNormalDayGuy

Reputation: 735

C# multi-threaded foreach loop

I've recently started working with multi-threaded calls in C#, and I'm unsure whether my approach is correct.

How can I make this go faster? I'm guessing parallelism is the answer, but I have not been successful in integrating that concept into this code.

Edits

Please note this runs on a remote VM and it's a console program, so user experience is not an issue. I just want it to run fast, since the number of links may go up to 200k elements and we want results as soon as possible. I also removed all questions but one, since it's the one I would like help with.

Here is my code, which seems to work:

// Use of my results
public void Main() 
{
  var results = ValidateInternalLinks();
  // Writes results to txt file
  WriteResults(results.Result, "Internal Links");
}

// Validation of data
public async Task<List<InternalLinksModel>> ValidateInternalLinks() 
{
  var tasks = new List<Task>();
  var InternalLinks = new List<InternalLinksModel>();
  // Populate InternalLinks with the data

  foreach (var internalLink in InternalLinks)
  {
    tasks.Add(GetResults(internalLink));
  }

  await Task.WhenAll(tasks);

  return InternalLinks;
}

// Get Results for each piece of data
public async Task GetResults(InternalLinksModel internalLink)
{ 
  var response = await SearchValue(internalLink.SearchValue);
  
  // Analyse response and change result (possible values: SUCCESS, FAILED, [])
  internalLink.PossibleResults = ValidateSearchResult(response);
}

// Http Request
public async Task<ResponseModel> SearchValue(string value) 
{
  // RestSharp API creation and headers addition
  var response = await client.ExecuteTaskAsync(request);

  return JsonConvert.DeserializeObject<ResponseModel>(response.Content);
}

Upvotes: 0

Views: 668

Answers (2)

Yuli Bonner

Reputation: 1269

async/await/WhenAll is the correct way to go; your performance bottleneck is most likely I/O bound (the HTTP requests), not compute bound, and asynchrony is the appropriate tool for that. How many HTTP requests are you making, and are they all to the same server? If so, you may be hitting a connection limit. I'm not very familiar with RestSharp, but you might try increasing the connection limit via ServicePointManager. The more outstanding requests you have (assuming the server can handle them), the faster the WhenAll will complete.

https://learn.microsoft.com/en-us/dotnet/api/system.net.servicepointmanager?view=netframework-4.8
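For example, a rough sketch of raising the limit before firing the requests (the value 100 is purely illustrative; tune it to what the target server can handle):

using System.Net;

// On .NET Framework the default is 2 concurrent connections per host,
// which would throttle a large batch of requests to the same server.
ServicePointManager.DefaultConnectionLimit = 100; // illustrative value, not a recommendation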

All of that said, I would reorganize your code: use Task/WhenAll for your HTTP requests, and process the responses after the WhenAll completes. If you do this, you can determine with certainty whether the HTTP requests are where the bottleneck is, by setting a breakpoint after the WhenAll and observing the execution times. If you can't debug, you can log the execution time instead. This should give you an idea of whether the bottleneck is primarily network I/O. I'm pretty confident it is.
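For example, a rough sketch of logging the time spent in each phase (the Stopwatch usage and messages are just an illustration, not part of your code):

var stopwatch = System.Diagnostics.Stopwatch.StartNew();

await Task.WhenAll(tasks); // all the HTTP requests
Console.WriteLine($"HTTP requests completed in {stopwatch.Elapsed}");

stopwatch.Restart();
// ... deserialize and validate the responses here ...
Console.WriteLine($"Processing completed in {stopwatch.Elapsed}");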

If it turns out that there is a compute bottleneck, you can use a Parallel.ForEach loop to deserialize, validate, and assign.

var internalLinks = new List<InternalLinksModel>();
// Populate internalLinks with the data
// (I'm assuming internalLinks already contains data at this point. If not, I'm not sure I understand your code.)

// You shouldn't need a ConcurrentDictionary, since you'll only be doing reads in parallel.
var dictionary = new Dictionary<Task<IRestResponse>, InternalLinksModel>();

// Make the API calls - I/O bound
foreach (var l in internalLinks)
{
    // Build the RestSharp request from l.SearchValue, as in your SearchValue method
    var request = new RestRequest(l.SearchValue);
    dictionary[client.ExecuteTaskAsync(request)] = l;
}

await Task.WhenAll(dictionary.Keys);
// I/O is done.

// Compute bound - deserialize, validate, assign.
Parallel.ForEach(dictionary.Keys, task =>
{
    var responseModel = JsonConvert.DeserializeObject<ResponseModel>(task.Result.Content);
    dictionary[task].PossibleResults = ValidateSearchResult(responseModel);
});

// Write results to a txt file
WriteResults(dictionary.Values.ToList(), "Internal Links");

Upvotes: 1

Theodor Zoulias

Reputation: 43515

It seems that you have a series of I/O bound and CPU bound jobs that need to run one after the other, with a varying degree of concurrency required for each step. A good tool for dealing with that kind of workload is the TPL Dataflow library. This library is designed in a way that allows forming pipelines (or even complex networks) of data that flows from one block to the next.

I tried to come up with an example that demonstrates using this library, and then realized that your workflow includes a last step where a property (internalLink.PossibleResults) must be updated on the first type of item entering the pipeline. This complicates things quite a lot, because it implies that the first type must be carried along all the steps of the pipeline. The easiest way to do this would probably be to use ValueTuples as the input and output of the blocks. That would make my example too messy though, so I preferred to keep it in its simplest form, since its purpose is mainly to demonstrate the capabilities of the TPL Dataflow library:

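// Requires the System.Threading.Tasks.Dataflow NuGet package
// using System.Threading.Tasks.Dataflow;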
var cts = new CancellationTokenSource();
var restClient = new RestClient();

var block1 = new TransformBlock<InternalLinksModel, IRestResponse>(async item =>
{
    // Build the RestSharp request from item.SearchValue, as in the question's SearchValue method
    return await restClient.ExecuteTaskAsync(new RestRequest(item.SearchValue));
}, new ExecutionDataflowBlockOptions()
{
    MaxDegreeOfParallelism = 10, // 10 concurrent REST requests max
    CancellationToken = cts.Token, // Cancel at any time
});

var block2 = new TransformBlock<IRestResponse, ResponseModel>(item =>
{
    return JsonConvert.DeserializeObject<ResponseModel>(item.Content);
}, new ExecutionDataflowBlockOptions()
{
    MaxDegreeOfParallelism = 2, // 2 threads max for this CPU bound job
    CancellationToken = cts.Token, // Cancel at any time
});

var block3 = new TransformBlock<ResponseModel, string>(async item =>
{
    return await SearchValue(item);
}, new ExecutionDataflowBlockOptions()
{
    MaxDegreeOfParallelism = 10, // Concurrency 10 for this I/O bound job
    CancellationToken = cts.Token, // Cancel at any time
});

var block4 = new ActionBlock<string>(item =>
{
    ValidateSearchResult(item);
}, new ExecutionDataflowBlockOptions()
{
    MaxDegreeOfParallelism = 1, // 1 thread max for this CPU bound job
    CancellationToken = cts.Token, // Cancel at any time
});

block1.LinkTo(block2, new DataflowLinkOptions() { PropagateCompletion = true });
block2.LinkTo(block3, new DataflowLinkOptions() { PropagateCompletion = true });
block3.LinkTo(block4, new DataflowLinkOptions() { PropagateCompletion = true });

var internalLinks = new List<InternalLinksModel>();
// Populate internalLinks with the data
foreach (var internalLink in internalLinks)
{
    await block1.SendAsync(internalLink);
}
block1.Complete();

await block4.Completion;

Two types of blocks are used in this example, TransformBlock and ActionBlock. An ActionBlock is usually the last block of a pipeline, since it doesn't produce any output. In case your workload is too granular, and the overhead of passing the objects around is comparable with the workload itself, you could start the pipeline with a BatchBlock, and then process the next steps in batches of, say, 10 elements each. It doesn't seem that this is required in your case though, since making web requests and parsing JSON responses are pretty bulky jobs.
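For completeness, here is a rough sketch of the ValueTuple idea mentioned above: each block passes the original InternalLinksModel along with its intermediate result, so that the last block can assign internalLink.PossibleResults. The RestSharp call is adapted from the question, and the degrees of parallelism are arbitrary:

var fetchBlock = new TransformBlock<InternalLinksModel, (InternalLinksModel, IRestResponse)>(async link =>
{
    // Keep the original model together with its HTTP response
    var response = await restClient.ExecuteTaskAsync(new RestRequest(link.SearchValue));
    return (link, response);
}, new ExecutionDataflowBlockOptions() { MaxDegreeOfParallelism = 10 });

var assignBlock = new ActionBlock<(InternalLinksModel Link, IRestResponse Response)>(pair =>
{
    var model = JsonConvert.DeserializeObject<ResponseModel>(pair.Response.Content);
    pair.Link.PossibleResults = ValidateSearchResult(model);
}, new ExecutionDataflowBlockOptions() { MaxDegreeOfParallelism = 2 });

fetchBlock.LinkTo(assignBlock, new DataflowLinkOptions() { PropagateCompletion = true });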

Upvotes: 2
