Mavershang

Reputation: 1278

Multithreading speed issue

I added a multithreading part to my code.

 public class ThreadClassSeqGroups
    {
        public Dictionary<string, string> seqGroup;
        public Dictionary<string, List<SearchAlgorithm.CandidateStr>> completeModels;
        public Dictionary<string, List<SearchAlgorithm.CandidateStr>> partialModels;
        private Thread nativeThread;

        public ThreadClassSeqGroups(Dictionary<string, string> seqs)
        {
            seqGroup = seqs;
            completeModels  = new Dictionary<string, List<SearchAlgorithm.CandidateStr>>();
            partialModels   = new Dictionary<string, List<SearchAlgorithm.CandidateStr>>();
        }

        public void Run(DescrStrDetail dsd, DescrStrDetail.SortUnit primarySeedSu,
            List<ushort> secondarySeedOrder, double partialCutoff)
        {
            nativeThread = new Thread(() => this._run(dsd, primarySeedSu, secondarySeedOrder, partialCutoff));
            nativeThread.Priority = ThreadPriority.Highest;
            nativeThread.Start();
        }

        public void _run(DescrStrDetail dsd, DescrStrDetail.SortUnit primarySeedSu,
            List<ushort> secondarySeedOrder, double partialCutoff)
        {
            int groupSize = this.seqGroup.Count;
            int seqCount = 0;
            foreach (KeyValuePair<string, string> p in seqGroup)
            {
                Console.WriteLine("ThreadID {0} (priority:{1}):\t#{2}/{3} SeqName: {4}",
                    nativeThread.ManagedThreadId, nativeThread.Priority.ToString(), ++seqCount, groupSize, p.Key);
                List<SearchAlgorithm.CandidateStr> tmpCompleteModels, tmpPartialModels;
                SearchAlgorithm.SearchInBothDirections(
                        p.Value.ToUpper().Replace('T', 'U'), dsd, primarySeedSu, secondarySeedOrder, partialCutoff,
                        out tmpCompleteModels, out tmpPartialModels);
                completeModels.Add(p.Key, tmpCompleteModels);
                partialModels.Add(p.Key, tmpPartialModels);
            }
        }

        public void Join()
        {
            nativeThread.Join();
        }

    }

class Program
{
    public static int _paramSeqGroupSize = 2000;
    static void Main(Dictionary<string, string> rawSeqs)
    {
        // Split the whole rawSeqs (Dict<name, seq>) into several groups
        Dictionary<string, string>[] rawSeqGroups = SplitSeqFasta(rawSeqs, _paramSeqGroupSize);


        // Create a thread for each seqGroup and run
        var threadSeqGroups = new MultiThreading.ThreadClassSeqGroups[rawSeqGroups.Length];
        for (int i = 0; i < rawSeqGroups.Length; i++)
        {
            threadSeqGroups[i] = new MultiThreading.ThreadClassSeqGroups(rawSeqGroups[i]);
            //threadSeqGroups[i].SetPriority();
            threadSeqGroups[i].Run(dsd, primarySeedSu, secondarySeedOrder, _paramPartialCutoff);
        }

        // Merge results from threads after the thread finish
        var allCompleteModels   = new Dictionary<string, List<SearchAlgorithm.CandidateStr>>();
        var allPartialModels    = new Dictionary<string, List<SearchAlgorithm.CandidateStr>>();
        foreach (MultiThreading.ThreadClassSeqGroups t in threadSeqGroups)
        {
            t.Join();
            foreach (string name in t.completeModels.Keys)
            {
                allCompleteModels.Add(name, t.completeModels[name]);
            }
            foreach (string name in t.partialModels.Keys)
            {
                allPartialModels.Add(name, t.partialModels[name]);
            }
        }
    }
}

However, the run with multiple threads is much slower than with a single thread, and the CPU load is generally <10%.

For example:

The input file contains 2500 strings.

_paramSeqGroupSize = 3000: main thread + 1 calculation thread took 200 sec.

_paramSeqGroupSize = 400: main thread + 7 calculation threads took much longer (I killed it after more than 10 minutes).

Is there a problem with my implementation? How can I speed it up?

Thanks.

Upvotes: 0

Views: 138

Answers (3)

Tudor

Reputation: 62439

It seems to me that you are trying to process a file in parallel with multiple threads. This is a bad idea, assuming you have a single mechanical disk.

Basically, the head of the disk needs to seek the next reading location for each read request. This is a costly operation and since multiple threads issue read commands it means the head gets bounced around as each thread gets its turn to run. This will drastically reduce performance compared to the case where a single thread is doing the reading.
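If the file really is read from several threads, one way to avoid the seek problem is to let a single thread read everything first and only parallelize the in-memory work. The following is a minimal sketch under that assumption; `SequentialReadDemo`, `ReadAllSequences`, `Normalize`, and `ProcessInParallel` are hypothetical names, and `Normalize` is only a stand-in for the real per-sequence computation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SequentialReadDemo
{
    // Hypothetical helper: reads the whole file on the calling thread,
    // so the disk only ever sees one sequential stream of reads.
    public static List<string> ReadAllSequences(string path)
    {
        return File.ReadLines(path)
                   .Where(line => line.Length > 0)   // skip empty lines
                   .ToList();
    }

    // Stand-in for the CPU-bound work done per sequence (assumption).
    public static string Normalize(string seq)
    {
        return seq.ToUpperInvariant().Replace('T', 'U');
    }

    // Only the in-memory processing runs in parallel; AsOrdered keeps
    // the output in the same order as the input.
    public static string[] ProcessInParallel(IReadOnlyList<string> seqs)
    {
        return seqs.AsParallel()
                   .AsOrdered()
                   .Select(s => Normalize(s))
                   .ToArray();
    }
}
```

A call such as `SequentialReadDemo.ProcessInParallel(SequentialReadDemo.ReadAllSequences(path))` then touches the disk from one thread only.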

Upvotes: 3

Peter Ritchie

Reputation: 35881

When threads run, they are given time on a specific processor. If there are more threads than processors, the system context switches between threads so that every active thread gets some time to run. Context switching is really expensive. If you have more threads than processors, most of the CPU time can be taken up by context switching, which can make a single-threaded solution look faster than a multithreaded one.

Your example starts an indeterminate number of threads. If SplitSeqFasta returns more entries than you have cores, you will create more threads than cores and introduce a lot of context switching.

I suggest you throttle the number of threads manually, or use something like the Task Parallel Library (TPL) and its Parallel class to have it throttle automatically for you.
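As a rough sketch of the second option, `Parallel.ForEach` can iterate over the sequences directly with an upper bound on concurrency. The names `ThrottledSearchDemo`, `Search`, and `Run` are hypothetical, and `Search` merely stands in for `SearchAlgorithm.SearchInBothDirections`, whose implementation is not shown in the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ThrottledSearchDemo
{
    // Hypothetical stand-in for the real search; returns a dummy result.
    public static int Search(string seq)
    {
        return seq.Length;
    }

    public static ConcurrentDictionary<string, int> Run(IDictionary<string, string> rawSeqs)
    {
        // Thread-safe result store, so no manual merging step is needed.
        var results = new ConcurrentDictionary<string, int>();

        // Cap the number of concurrent workers at the number of logical cores.
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.ForEach(rawSeqs, options, pair =>
        {
            string rna = pair.Value.ToUpperInvariant().Replace('T', 'U');
            results[pair.Key] = Search(rna);
        });

        return results;
    }
}
```

With this shape there is no need to split the input into fixed-size groups up front; the library partitions the work across the available workers.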

Upvotes: 0

Dan Puzey

Reputation: 34198

What was the code prior to multithreading? It's hard to tell what this code is doing, and much of the "working" code seems to be hidden in your search algorithm. However, some thoughts:

  1. You mention an "input file", but this is not clearly shown in code - if your file access is being threaded, this will not increase performance as the file access will be the bottleneck.
  2. Creating more threads than you have CPU cores will ultimately reduce performance (unless each thread is blocked waiting on different resources). In your case I would suggest that 8 total threads is too many.
  3. It seems that a lot of data (memory) access might be done through your class DescrStrDetail, which is passed from the variable dsd in your Main method to every child thread. However, the declaration of this variable is missing, so its usage and implementation are unknown. If this variable has locks that prevent multiple threads from accessing it at the same time, then your threads will potentially be locking each other out of this data, further slowing performance.
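To illustrate the last point: if shared state does turn out to be guarded by a lock, one common way to reduce contention is to let each worker accumulate into private state and take the lock only once, when merging. This is a sketch of that general pattern, not of the asker's code; `LocalStateDemo` and `Run` are hypothetical names, and the string length stands in for the real search result.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class LocalStateDemo
{
    public static Dictionary<string, int> Run(IEnumerable<KeyValuePair<string, string>> seqs)
    {
        var merged = new Dictionary<string, int>();
        var mergeLock = new object();

        Parallel.ForEach<KeyValuePair<string, string>, Dictionary<string, int>>(
            seqs,
            // Each worker starts with its own private dictionary.
            () => new Dictionary<string, int>(),
            // The loop body writes only to private state, so it needs no lock.
            (pair, loopState, local) =>
            {
                local[pair.Key] = pair.Value.Length; // placeholder for real work
                return local;
            },
            // The shared dictionary is locked once per worker, not once per item.
            local =>
            {
                lock (mergeLock)
                {
                    foreach (var kv in local)
                    {
                        merged[kv.Key] = kv.Value;
                    }
                }
            });

        return merged;
    }
}
```

Whether this helps depends on whether DescrStrDetail actually takes locks internally, which the question does not show.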

Upvotes: 0
