Aht

Reputation: 593

Is Parallel.ForEach the best way to improve the execution speed of this code?

I have code that I need to rewrite to improve its execution speed. The original code:

Data class:

public class Data
{
    public string Id { get; set; }
    // ... other properties
}

Services (there are more than two; I show just two as an example):

public class SomeService1
{
    public Result Process(Data data)
    {
        // Load data from different services here
    }
}

public class SomeService2
{
    public Result Process(Data data)
    {
        // Load data from different services here
    }
}

Actual method:

public void Calculate(List<Data> datas)
{
    Result result;
    SomeService1 someService1 = new SomeService1();
    SomeService2 someService2 = new SomeService2();
    // At this point the list of data has about 2000 elements
    foreach (var data in datas)
    {
        // Id is a string, so the case labels must be string literals
        switch (data.Id)
        {
            case "1":
                result = someService1.Process(data);
                break;
            case "2":
                result = someService2.Process(data);
                break;
            default:
                result = null;
                break;
        }
        ProcesAndSaveDataToDatabase(result);
    }
}

The Calculate method takes a List<Data> as a parameter. For every element in the list, it fetches data from an outside service (the service is determined by the Id in the Data class), then processes that data and saves it to the database. For 2000 elements the whole operation takes about 8 minutes, and 70% of that time is spent gathering data from the outside services. I must reduce that time. I can't test with real data, because the only data is in the Production environment (and testing on production is a bad idea). I have one idea for how to do this. Can you look at it and advise me whether I am going in the right direction?

Data class:

public class Data
{
    public string Id { get; set; }
    // ... other properties
}

Services (there are more than two; I show just two as an example):

public class SomeService1 : IService
{
    public Result Process(Data data)
    {
        // Load data from different services here
    }
}

public class SomeService2 : IService
{
    public Result Process(Data data)
    {
        // Load data from different services here
    }
}

IService:

public interface IService
{
    Result Process(Data data);
}

Actual method:

public void Calculate(List<Data> datas)
{
    // Separate groups of items, split by Id
    var split = from data in datas group data by data.Id into newDatas select newDatas;
    Parallel.ForEach(split, new ParallelOptions { MaxDegreeOfParallelism = 4 }, dataGroup =>
    {
        // The group key is the Id shared by every item in the group
        IService service = GetService(dataGroup.Key);
        if (service == null) return;
        foreach (var data in dataGroup)
        {
            Result result = service.Process(data);
            ProcesAndSaveDataToDatabase(result);
        }
    });
}

private IService GetService(string id)
{
    IService service = null;
    if (id == null) return service;
    // id is a string, so the case labels must be string literals
    switch (id)
    {
        case "1":
            service = new SomeService1();
            break;
        case "2":
            service = new SomeService2();
            break;
    }
    return service;
}

With this idea I try to split the data for the different services across different threads. So if the list has 20 items with Id = 1 and 10 items with Id = 2, it should create 2 separate threads and process the two groups independently, which should let me cut the execution time. Is this a good approach? Are there any other possibilities for improving this code?

Thanks

Upvotes: 0

Views: 333

Answers (2)

Narayana Erukulla

Reputation: 104

While you'll reap the benefits of parallelism (Parallel.ForEach) in your application, it is not the only way to improve the execution speed of your code.

Also, since you are using LINQ in your application, and might be using it extensively, you may want to use PLINQ (Parallel LINQ) wherever possible.
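
As a rough sketch of what a PLINQ query looks like: the `Transform` method below is a hypothetical stand-in for some per-item computation and is not part of the question's code. Note that PLINQ mainly pays off when the per-item work is CPU-bound.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public string Id { get; set; }
}

public static class PlinqSketch
{
    // Hypothetical stand-in for a CPU-bound computation on one item.
    public static int Transform(Data data)
    {
        return data.Id.Length * 2;
    }

    public static List<int> TransformAll(List<Data> datas)
    {
        return datas
            .AsParallel()                 // switch the query to PLINQ
            .AsOrdered()                  // keep results in source order
            .WithDegreeOfParallelism(4)   // cap concurrency, as in the question
            .Select(d => Transform(d))
            .ToList();
    }
}
```

`AsOrdered()` costs some throughput; it can be dropped when the order of the results does not matter.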

I'd also suggest profiling your code to identify the hotspots and bottlenecks in your application, which should give you a better idea of where and how you can improve performance.
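
If a full profiler is not available, a crude first step is to time each phase with `Stopwatch`. This is only a minimal sketch; the `Measure` helper is a name made up for illustration, and you would wrap your own fetch and save calls in it.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs the given action once and returns how long it took.
    public static TimeSpan Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

For example, `TimingSketch.Measure(() => service.Process(data))` and `TimingSketch.Measure(() => ProcesAndSaveDataToDatabase(result))` would show how the time splits between fetching and saving for a single item.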

Also, as mentioned by Patrick, you should try using async and await wherever possible.

Check out this article from MSDN, which will give you more insight: https://msdn.microsoft.com/en-us/library/ff963552.aspx

Upvotes: 2

Patrick Huber

Reputation: 756

Parallel.ForEach helps with CPU-bound tasks, but you mention above that you are calling services, which is IO-bound work. Whenever you do IO-bound work (like calling an external service), you are better off using async and await instead of Parallel.ForEach.

Parallel.ForEach will spin up multiple threads and block those threads until the work is done (approx. 8 min with all threads blocked).

Async and Await will weave worker threads between service calls and effectively use IO completion ports to call back into your application. This avoids blocking of multiple threads and allows you to more efficiently use your computer's resources.

More info on how to make parallel asynchronous calls here:

https://msdn.microsoft.com/en-us/library/mt674880.aspx
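
A minimal sketch of that approach, assuming the services can expose an asynchronous method. `IAsyncService`, `ProcessAsync`, `FakeService` and `CalculateAsync` are hypothetical names, not part of the question's code, and `Task.Delay` merely simulates the wait on a remote call. A `SemaphoreSlim` caps the number of calls in flight so the outside services are not flooded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class Data { public string Id { get; set; } }
public class Result { public string Value { get; set; } }

// Hypothetical asynchronous counterpart of the question's IService.
public interface IAsyncService
{
    Task<Result> ProcessAsync(Data data);
}

// Stand-in service: Task.Delay simulates waiting on an external call.
public class FakeService : IAsyncService
{
    public async Task<Result> ProcessAsync(Data data)
    {
        await Task.Delay(100);
        return new Result { Value = "processed " + data.Id };
    }
}

public static class AsyncSketch
{
    public static async Task<Result[]> CalculateAsync(
        IReadOnlyList<Data> datas, IAsyncService service, int maxConcurrency)
    {
        using (var throttle = new SemaphoreSlim(maxConcurrency))
        {
            // ToList() forces the Select to run, which starts every task.
            var tasks = datas.Select(async data =>
            {
                await throttle.WaitAsync();   // wait for a free slot without blocking a thread
                try
                {
                    return await service.ProcessAsync(data);
                }
                finally
                {
                    throttle.Release();
                }
            }).ToList();

            // Results come back in the same order as the input list.
            return await Task.WhenAll(tasks);
        }
    }
}
```

While a call is waiting on the service, no thread is held for it, so a small number of threads can keep many requests in flight at once.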

Upvotes: 3
