red guardgen

Reputation: 43

How to increase performance of foreach search via threads or parallel extensions?

I am completely new to threading. In the method below I am trying to run the searches in parallel in a thread-safe manner (at least I hope so):

private void PerformSearch(List<FailedSearchReportModel> failedSearchReports)
    {
        foreach (var item in failedSearchReports)
        {
            item.SearchTerms = item.SearchTerms.Take(50).ToList();
            var siteId = ConstantsEnumerators.Constants.Projects.GetProjectIdByName(item.Site);
            if (SearchWrapperHelper.IsSeas(siteId))
            {
                item.UsedEngine = "Seas";
                var model = GetBaseQueryModel(item.Site);
                Parallel.ForEach(item.SearchTerms,
                         new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 2.0)) },
                         (term) =>
                     {
                         lock (seasSyncRoot)
                         {
                             CheckSearchTermInSeas(model, term, item.Site, item.Language);
                         }
                     });
            }
            else
            {
                item.UsedEngine = "Fast";
                Parallel.ForEach(item.SearchTerms, term =>
                    {
                        lock (fastSyncRoot)
                        {
                            CheckSearchTermInFast(term, item.Site, item.Language);
                        }
                    });
            }
        }
    }

The guidelines say that a lock statement should wrap as little code as possible, yet I am locking around the whole call. Here is what the nested CheckSearchTerm method looks like:

private void CheckSearchTermInSeas(SearchQueryModel baseModel, FailedSearchTermModel term, string site, string language)
    {
        var projectId = ConstantsEnumerators.Constants.Projects.GetProjectIdByName(site);

        term.SearchTerm = ClearSearchTerm(term.SearchTerm).Replace("\"", string.Empty);
        var results = SearchInSeas(baseModel, term.SearchTerm, projectId, language);
        term.DidYouMean = GetDidYouMean(results?.Query.Suggestion, term.SearchTerm);
        term.HasResult = results?.NumberOfResults > 0;
        if (!term.HasResult && string.IsNullOrEmpty(term.DidYouMean))
        {
            results = SearchInSeas(baseModel, term.SearchTerm, projectId, InverseLanguage(language));
            term.WrongLanguage = results?.NumberOfResults > 0;
            if (!term.WrongLanguage)
            {
                term.DidYouMean = GetDidYouMean(results?.Query.Suggestion, term.SearchTerm);
            }
        }

        if (!string.IsNullOrEmpty(term.DidYouMean))
        {
            results = SearchInSeas(baseModel, term.DidYouMean, projectId, term.WrongLanguage ? InverseLanguage(language) : language);
            term.DidYouMeanHasResult = results?.NumberOfResults > 0;
            if (!term.DidYouMeanHasResult)
            {
                results = SearchInSeas(baseModel, term.DidYouMean, projectId, term.WrongLanguage ? language : InverseLanguage(language));
                term.DidYouMeanHasResult = results?.NumberOfResults > 0;
            }
        }
    }

Am I doing everything right? Could you please provide some explanation? Or should I change it? PS: If I now need to write all these records to a file (Excel), should I also use Parallel to increase performance? And if so, would the approach be the same?

Upvotes: 0

Views: 505

Answers (1)

Theodor Zoulias

Reputation: 43738

In ASP.NET applications, threads are a precious resource. The more threads you have available, the more requests you can serve concurrently. The threads that serve requests and the threads that do parallel work come from the same pool, the ThreadPool. So the more parallel work you do, the fewer concurrent clients you can serve. Doing parallel work with Parallel.ForEach is particularly nasty when the loop is not configured with the MaxDegreeOfParallelism option. This beast can single-handedly saturate your ThreadPool by using every available thread the pool has, and requesting even more. One non-configured Parallel.ForEach in your web app is enough to reduce the scalability of your application to nothingness.

The second Parallel.ForEach in your code, the one in the else branch that sets item.UsedEngine = "Fast", is not configured with MaxDegreeOfParallelism.
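
To make the configuration point concrete, here is a minimal sketch of a Parallel.ForEach loop that passes a ParallelOptions with an explicit MaxDegreeOfParallelism and records the highest concurrency it actually observed. The class name BoundedParallelDemo, the Run method, and the delegate standing in for the search call are all made up for illustration; nothing here is taken from the code in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedParallelDemo
{
    // Hypothetical helper: runs `work` for every term with at most
    // `maxParallelism` concurrent invocations, and returns the peak
    // number of invocations that were running at the same time.
    public static int Run(IEnumerable<string> terms, int maxParallelism, Action<string> work)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };
        int current = 0;
        int peak = 0;

        Parallel.ForEach(terms, options, term =>
        {
            int now = Interlocked.Increment(ref current);

            // Raise `peak` to `now` if it is larger, retrying on contention.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen)
            {
            }

            try
            {
                work(term); // stand-in for the real search call
            }
            finally
            {
                Interlocked.Decrement(ref current);
            }
        });

        return peak;
    }
}
```

Calling BoundedParallelDemo.Run(terms, 2, t => Thread.Sleep(10)) should never report a peak above 2, whatever the size of the ThreadPool. The value 2 is arbitrary; the right limit depends on your server and workload.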

And what are all these threads going to do? Almost nothing. At most one or two threads will be doing work, and all the others will be blocked behind the lock, waiting for their turn. That's not an efficient way to utilize the resources of your server. By using parallelism, you made your web app slower for everyone. If you have a performance problem in a web application, introducing parallelism should be your last thought as a solution. It's much more likely to intensify the problem than to solve it.
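
The effect of the lock can be checked with a small experiment. The sketch below is an illustration only, with hypothetical names (LockSerializationDemo, PeakConcurrencyInsideLock) and a Thread.Sleep standing in for the search call. It wraps the whole loop body in a lock, as the code in the question does, and counts how many threads are ever inside the locked region at once.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LockSerializationDemo
{
    private static readonly object SyncRoot = new object();

    // Hypothetical helper: returns the highest number of threads that were
    // inside the locked region at the same time.
    public static int PeakConcurrencyInsideLock(int itemCount)
    {
        int inside = 0;
        int peak = 0;

        Parallel.ForEach(
            Enumerable.Range(0, itemCount),
            new ParallelOptions { MaxDegreeOfParallelism = 4 }, // arbitrary cap
            _ =>
            {
                lock (SyncRoot)
                {
                    // Plain increments are safe here: only the lock holder runs this.
                    inside++;
                    if (inside > peak) peak = inside;

                    Thread.Sleep(5); // stand-in for the search call

                    inside--;
                }
            });

        return peak;
    }
}
```

For any positive itemCount the method returns 1: the loop may occupy several threads, but only one of them makes progress at a time, and the rest wait for the lock.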

Upvotes: 1