user2110714
user2110714

Reputation:

Running time of counting sort

I am trying to understand the running time of counting sort. In my notes, it says, assuming the size of the array A is n, and k is the number of times each number occurs,

 Counting-Sort(A,k) {
   for (i=1; i <= k; i++) // initialize number counters to 0
       times[i] = 0;

   for (j=1; j <= length[A]; j++) // decide how many times each
       times[A[j]]++;                  // number appears in the input

  // form the sorted array
  m=1;
    for ( i=1; i <= k; i++)    // consider each number in the range
         for ( j=1; j <= times[ i ]; j++) { // generate that number in 
            A[m]=i;                   // the output as many times as
            m++;                      // it occurs in the input
         }
  }

Let ti denote the number of times the inner loop iterates for each i. When we look at the nested for loops at the bottom of the code, note that, every time the inner loop is iterated, we are placing a new number into its correct place in the output array. Hence: sum ti (from i=1 to k) = n.

I don't understand why this sum is equal to n. The outer loop iterates k times, and inner loop may iterate at most n times, so it must be O(nk). Can somebody explain? Thanks

Upvotes: 4

Views: 5373

Answers (3)

Khaled.K
Khaled.K

Reputation: 5940

Algorithm

Counting Sort, known also as Histogram Sort

n = length of input Array

k = number of unique symbols that appear in the input Array
  • Initialization takes k time

  • Counting takes n time

  • Enumeration takes Sum { Count(i) } = n time

Complexity

Time = k + n + n = 2n+k

Time ~ O(2n+k) = O(n+k)

Space = n + k ~ O(n+k)

Upvotes: 3

artahian
artahian

Reputation: 2093

As you can see for the loop

m=1;
for ( i=1; i <= k; i++)    // consider each number in the range
     for ( j=1; j <= times[ i ]; j++) { // generate that number in 
        A[m]=i;                   // the output as many times as
        m++;                      // it occurs in the input
     }

its complexity will be equal to how many times the body of inner cycle is executed. When i = 1, it is executed times[1] times, when i = 2 => times[2] times, and so, when i = k, times[k] times. So in total inner body is executed times[1] + times[2] + ... + times[k] times, and that sum is equal to n, as each element is counted once.

Upvotes: 0

Jan Kr&#252;ger
Jan Kr&#252;ger

Reputation: 18530

The notes aren't entirely correct. k is the highest number that shows up in your array (i.e. it has numbers from 1 to k). So, let's step through the algorithm:

  1. Initialize the k "bins": O(k)
  2. Count how often each number occurs: O(n)
    • The sum of the values in all bins is exactly n, because that's how many entries we had in total in the array
  3. Iterate over the bins: O(k)
    • Set just as many elements in the result array as the bin represents: seemingly O(n)

The important thing here is that we know that even though we iterate over k bins and there could, in general, be up to n values represented by each bin, we set up the bins so that, across all the iterations of the outer loop, the inner loop still runs for a total of n times. Hence, the total complexity of this step is actually O(n).

If we ignored our extra knowledge about the contents of the bins, you'd be absolutely right.

Now there's a final twist: if k > n, it's actually in O(k) rather than O(n)! Because now the outer loop is the thing that "runs more often".

I hope this makes some sense.

Upvotes: 0

Related Questions