user3612934
user3612934

Reputation: 19

Homework: Creating O(n) algorithm for sorting

I am taking the cs50 course on edx and am doing the hacker edition of pset3 (in essence it is the advanced version). Basically the program takes a value to be searched for as the command-line argument, and then asks for a bunch of numbers to be used in an array. Then it sorts and searches that array for the value entered at the command-line. The way the program is implemented, it uses a pseudo-random number generator to feed the numbers for the array.

The task is to write the search and sorting functions. I already have the searching function, but the sorting function is supposed to be O(n). In the regular version you were supposed to use a O(n ^ 2) algorithm which wasn't a problem to implement. Also using a log n algorithm wouldn't be an issue either. But the problem set specifically ask's for a big O(n) algorithm.

It gives a hint in saying that, since no number in the array is going to be negative, and the not greater than LIMIT (the numbers output by the generator are modulo'd so they are not greater than 65000). But how does that help in getting the algorithm to be O(n)?

But the counting sort algorithm, which purports to be an acceptable solution, returns a new sorted array rather than actually sort the original one, and that contradicts with the pset specification that reads 'As this return type of void implies, this function must not return a sorted array; it must instead "destructively" sort the actual array that it’s passed by moving around the values therein.'

Also, if we decide to copy the sorted array onto the original one using another loop, with so many consecutive loops, I'm not sure if the sorting function can be considered to have a running time of O(n) anymore. Here is the actual pset, the question is about the sorting part.

Any ideas to how to implement such an algorithm would be greatly appreciated. It's not necessary to provide actual code, rather just the logic of you can create a O(n) algorithm under the conditions provided.

Upvotes: 1

Views: 695

Answers (2)

Gene
Gene

Reputation: 46960

As @DarkCthulhu said, counting sort is clearly what they were urging you to use. But you could also use a radix sort.

Here is a particularly concise radix 2 sort that exploits a nice connection to Gray codes. In your application it would require 16 passes over the input, one per data bit. For big inputs, the counting sort is likely to be faster. For small ones, the radix sort ought to be fster because you avoid initializing 256K bytes or more of counters.

See this article for explanation.

void sort(unsigned short *a, int len) 
{  
  unsigned short bit, *s = a, *d = safe_malloc(len * sizeof *d), *t;  
  unsigned is, id0, id1;

  for (bit = 1; bit; bit <<= 1, t = s, s = d, d = t)
    for (is = id0 = 0, id1 = len; is < len; ++is)
      if (((s[is] >> 1) ^ s[is]) & bit)
        d[--id1] = s[is];
      else
        d[id0++] = s[is];
  free(d);
}

Upvotes: 2

Anirudh Ramanathan
Anirudh Ramanathan

Reputation: 46728

It gives a hint in saying that, since no number in the array is going to be negative, and the not greater than LIMIT (the numbers outputted by the generator are modulo'd to not be higher than 65000). But how does that help in getting the algorithm to be O(n).

That hint directly seems to point towards counting sort. You create 65000 buckets and use them to count the number of occurrences of each number. Then, you just revisit the buckets and you have the sorted result.

It takes advantage of the fact that:

  1. They are integers.
  2. They have a limited range.

Its complexity is O(n) and as this is not a comparison-based sort, the O(nlogn) lower bound on sorting does not apply. A very good visualization is here.

Upvotes: 5

Related Questions