Joji
Joji

Reputation: 5646

the running time for an already sorted array being sorted by selection sort algorithm Vs the time for a reversed sorted array being sorted

I'm trying to figuring out the running time for using selection sort algorithm to sort an already sorted array (e.g. 1,2,3,4,5,..) and the time for using that to sort a reversed array (e.g. 5,4,3,2..). The weird thing I found about this is that on my computer it takes more time to sort an already sorted array than sorting a reversed array. From what I learnt I think it should be the other way around.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
void selectionsort(int A[], int n) {
    int min;
    for (int i = 0; i < n - 1; i++) {
        min = i;
        for (int k = i + 1; k < n; k++) {
            if (A[k] < A[min]) {
                min = k;
            }
        }
        int temp = A[i];
        A[i] = A[min];
        A[min] = temp;
    }
}

void sort(int A[], int n) {
    for (int i = 0; i < n; i++) {
        A[i] = i + 1;
    }
}

void resver_sort(int A[], int n) {
    for (int i = 0; i < n; i++) {
        A[i] = n - i;
    }
}

int main() {
    clock_t start, end;
    double cpu_time_used;

    int A[20000] = { 0 };
    int B[40000] = {0};
    int C[100000] = {0};

    printf("Selection Sort, Sorted Array\n");

    sort(A, 20000);
    start = clock(); // start the clock
    selectionsort(A, 20000);
    end = clock(); // stop the clock
    cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC; // calculate the actual time used
    printf("array size:20000 time:%f\n", cpu_time_used);

    sort(B, 40000);
    start = clock();
    selectionsort(B, 40000);
    end = clock();
    cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC;
    printf("array size:40000 time:%f\n", cpu_time_used);

    sort(C, 100000);
    start = clock();
    selectionsort(C, 100000);
    end = clock();
    cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC;
    printf("array size:100000 time:%f\n", cpu_time_used);

    printf("Selection Sort, reverse sorted Array\n");

    resver_sort(A, 20000);
    start = clock(); 
    selectionsort(A, 20000);
    end = clock(); 
    cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC; 
    printf("array size:20000 time:%f\n", cpu_time_used);

    resver_sort(B, 40000);
    start = clock();
    selectionsort(B, 40000);
    end = clock();  
    cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC;
    printf("array size:40000 time:%f\n", cpu_time_used);

    resver_sort(C, 100000);
    start = clock();    
    selectionsort(C,100000);
    end = clock();   
    cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC;
    printf("array size:100000 time:%f\n", cpu_time_used);
}

The result is

Selection Sort, Sorted Array
array size:20000 time:0.530281
array size:40000 time:2.109836
array size:100000 time:13.197117
Selection Sort, reverse sorted Array
array size:20000 time:0.500338
array size:40000 time:2.016468
array size:100000 time:12.830447
Program ended with exit code: 0

The first one which is an already sorted array takes more time. It doesn't make sense. I spent a lot of time debugging and trying to print these arrays to look into them but didn't figure it out.

Upvotes: 1

Views: 394

Answers (1)

chqrlie
chqrlie

Reputation: 144969

There are multiple problems in your code:

  • you do not include <stdio.h> and <time.h>
  • you do not define arrays B and C. I suggest you use arrays with static or dynamic (heap) storage rather than automatic to avoid stack overflow.
  • function sort has a buffer overflow in: for (int i = 0; i <= n; i++) it should be i < n.

Regarding the timings, your sorting function selectionsort performs exactly the same number of comparisons and swaps, except for the outcome of so the timings are bound to be close, only differing in the result of A[k] < A[min]. In the sorted case this test is always false, and it varies for the other case, the number of true cases linearly decreasing from n - 2 to 0. Depending on how the code is generated for this loop and how efficient the branch prediction feature of the CPU, you might get a small advantage for one or the other, Only careful timing will tell you that. Its seems from your results that the cost of a branch never taken more than compensates for the extra min = i store.

Your timings are actually pretty close (less than 6% difference). I get the same results, but multiple runs of the same program give timings that vary by the same order of magnitude... making it hard to draw definite conclusions.

Different compilers and different CPUs might produce an opposite result.

An initial pass to check for already sorted input is worthwhile as it adds very little to the timing of an inefficient sort like insertionsort. it would make these special cases almost instantaneous: 0.00001, 0.00002 and 0.00005 for your example.

More efficient sorting algorithms will also dwarf these timings: heapsort, mergesort, quicksort, radix sort should all be between 100 and 1000 times faster on your set sizes.

Upvotes: 1

Related Questions