Reed B
Reed B

Reputation: 646

Making a Parallel Sorting Algorithm Faster than Naive Optimized Quicksort?

As the title suggests, I need to make an algorithm that is faster than quicksort. The quicksort in question is optimized and used in a naive paralleled system so that a single thread completely does each quicksort but multiple threads are doing a quicksort at the same time. I need to make an algorithm that is faster than this process. Would it be faster to parallel each quicksort by having additional threads execute the sorting of each side of a pivot or would this process have too much overhead and end up causing a slow down? Any suggestions for an algorithm?

Upvotes: 0

Views: 803

Answers (2)

Jim Mischel
Jim Mischel

Reputation: 134005

If I understand correctly, you currently have a system in which N sorts are performed by N threads. Each individual sort is done by a single thread, but there can be multiple sorts running concurrently. You're asking if it would be faster to write a parallel sorting algorithm so that each sort is performed by multiple threads.

So let's say that you have four processors and you have to do 10 sorts. Assuming that each sort takes the same amount of time (unrealistic, but useful for discussion), then if each sort runs in a single thread you can have four sorts running at once. Call the time to perform one single-threaded sort one time period.

So the time to do all sorts will be three time periods. During two periods you have four sorts running concurrently, and during one period you have two concurrent sorts. You have excess capacity (two idle processors) during that last time period.

If you have a parallel sorting algorithm that uses four threads, then the best case is that each sort will take 1/4 the time of a single-threaded sort. So in theory you can perform the 10 sorts in 10/4 time periods, meaning that it will take only 2.5 time periods.

So in theory you could save half of a time period by going to a parallel sorting algorithm. But you won't realize that much of a performance gain because quicksort isn't 100% parallelizable; there are times when fewer than four threads will be involved. You have idle processors at random times during each sort. It's quite possible that using the parallel version will be slower overall because those small bits of idle time add up.

Think of it this way: you have four people who need to do 10 jobs. They can do those 10 jobs by each taking one and doing it individually, or by cooperating so that each of the four does 1/4 of the work on each job. There is no difference in the amount of work done. In the first case you have two idle workers while the last two jobs are being done. In the second case, you have some idle time while the last job is being done; worker 1 is idle 3/4 of the time period, worker 2 is idle for 1/2 the time period, and worker 3 is idle for 1/4 of the time period. So in theory your total worker idle time is 6/4, or 1.5 time periods.

But there's also idle time involved in moving a job from worker 1 to worker 2, etc. That transition time has both workers idle briefly. Those small bits of time (3 transitions per job, plus time for worker 1 to get the next job and start, and the time for worker 4 to deliver the finished product) add up, and could quite possibly exceed the .5 time periods that are ostensibly saved.

You could give it a try, though. It's pretty easy to parallelize quicksort. See, for example, http://reedcopsey.com/2010/02/26/parallelism-in-net-part-11-divide-and-conquer-via-parallel-invoke/.

Upvotes: 6

Dweeberly
Dweeberly

Reputation: 4777

This depends on the size and nature of the data. QS has worst case performance on sorted data (if memory serves). As you suggest you could provide a thread for each side of the pivot but you would want to limit that your partitions don't become too small. Please follow up and let us know how it goes, I'd be interested in knowing and I'm sure others would too.

Upvotes: 0

Related Questions