Reputation: 68
I'm working on developing sorting algorithms and was surprised to find c's qsort taking 1.6x as long Julia's default sorting algorithm. I imagine I'm making some sort of benchmarking mistake. Here are my benchmarking programs and their results:
Julia:
# time (julia bench.jl)
using Printf
function main()
len = 100_000_000
x = rand(Int64, len)
t = @elapsed sort!(x)
@printf "%d elements:\nclaim\t%fs" len t
end
main()
c
// time (gcc -O3 bench.c && ./a.out)
#include <stdlib.h>
#include <stdio.h>
#include <sys/time.h>
int comp (const void * elem1, const void * elem2)
{
int f = *((int*)elem1);
int s = *((int*)elem2);
if (f > s) return 1;
if (f < s) return -1;
return 0;
}
long long utime()
{
struct timeval now_time;
gettimeofday(&now_time, NULL);
return now_time.tv_sec * 1000000LL + now_time.tv_usec;
}
int main(int argc, char* argv[])
{
long length = 100000000;
long long *x;
x = (long long *) malloc(length * sizeof(long long));
if (x == NULL)
{
printf("Malloc failed\n");
return 1;
}
for (long cnt = 0 ; cnt < length ; cnt++)
x[cnt] = rand();
long long start = utime();
qsort (x, length, sizeof(*x), comp);
long long end = utime();
//for (long cnt = 0 ; cnt < length ; cnt += length/10)
// printf("%lld\n", x[cnt]);
free(x);
printf ("%ld elements:\nclaim\t%fs", length, (end-start)/1000000.0);
return 0;
}
Results
bash-3.2$ time (julia bench.jl)
100000000 elements:
claim 12.405531s
real 0m16.560s
user 0m13.883s
sys 0m1.297s
bash-3.2$ time (gcc -O3 bench.c && ./a.out)
100000000 elements:
claim 20.592641s
real 0m24.604s
user 0m21.352s
sys 0m2.479s
Is it true that Julia's algorithm (median of 3 quicksort with an insertion sort base case for less than 20 elements) is substantially faster than c's qsort? Can I sort faster than qsort in c?
Upvotes: 4
Views: 246
Reputation: 241791
It's easy to sort faster than C's qsort
. You could, for example, use C++'s std::sort
. The C++ library is not faster because it uses a better algorithm; rather, it's because C++'s generics allow the compiler to avoid the overhead of calling the comparison function and a smaller overhead in qsort's swap, which needs to handle elements of arbitrary size.
In the following, the only difference between sortbench-c and sortbench-cc is the use of std::sort
in the latter:
$ diff sortbench-c.c sortbench-cc.cc
1c1
< // time (gcc -O3 sortbench-c.c && ./a.out)
---
> // time (gcc -O3 sortbench-cc.cc && ./a.out)
2a3
> #include <algorithm>
7,14d7
< int comp (const void * elem1, const void * elem2)
< {
< int f = *((int*)elem1);
< int s = *((int*)elem2);
< if (f > s) return 1;
< if (f < s) return -1;
< return 0;
< }
38c31
< qsort (x, length, sizeof(*x), comp);
---
> std::sort(x, x+length);
The difference is dramatic:
$ time (gcc -O3 sortbench-c.c && ./a.out)
100000000 elements:
claim 16.673827s
real 0m17.774s
user 0m17.387s
sys 0m0.379s
$ time (gcc -O3 sortbench-cc.cc && ./a.out)
100000000 elements:
claim 9.948971s
real 0m11.133s
user 0m10.926s
sys 0m0.204s
Upvotes: 4
Reputation: 33601
The problem is that the rand
functions are [probably] different.
Quicksort is data/order dependent. For example, mergesort will always execute in the same amount of time, regardless of what data it is sorting.
However, quicksort's time will vary depending upon the data.
To do a proper benchmark, do not use rand
unless you write them yourself or guarantee that Julia's version and libc's version are exactly the same.
I'd write an initialization function for both langs. For example, the requisite for (i = 0; i < length; ++i) array[i] = length - i;
or some such, so that the initial data is guaranteed to be the same.
You can use a random function if you have one program generate the array and save it to a file. The other program can then read in the [same] data.
Sometimes, I write a separate program that generates the input data, and saves it to a file. Then, I pass that file off to both programs. This decouples the test data generation from the programs under test.
Upvotes: 0
Reputation: 3812
There is no performance guarantee for qsort:
Despite the name, neither C nor POSIX standards require this function to be implemented using quicksort or make any complexity or stability guarantees.
To do a proper sorting benchmark between Julia and C, you will need another implementation.
Upvotes: 0