Reputation: 333

why quick sort implemented in rcpp works slow?

I have implemented quick sort algorithm in Rcpp, but it works significantly slower than sort(array, method="quick") for large arrays. Why?

Here is my Rcpp code

// partition using hoare's scheme

#include <Rcpp.h>
using namespace Rcpp;

int partition(NumericVector a,int start,int end)
{
    double pivot = a[end];
    int i = start - 1;
    int j = end + 1;
    //Rcout << a <<"\n";
    while(1)
    {
        do {
            i++;
        } while (a[i] < pivot);

        do {
            j--;
        } while (pivot < a[j]);

        if(i > j)
            return j;

        //special.Swap(a, i, j);
        std::swap(a[i], a[j]);
    }
}

void qsort(NumericVector a,int start,int end)
{
  //Rcout << start <<"," << end <<"\n";
  if(start < end)
  {
    int P_index = partition(a, start, end);
    //Rcout << P_index << "\n";
    qsort(a, start, P_index);
    qsort(a, P_index + 1, end);
  }
}


// [[Rcpp::export]]
NumericVector QuickSortH_WC(NumericVector arr)
{
  int len = arr.size();
  qsort(arr, 0, len-1);
  //Rcout << arr <<"\n";
  return 1;
}

Also for arrays with floating values, the algorithm is worse. I want to make a comparison with hoare's and lomuto partitioning scheme, But I do not know whether this implementation has any flaw in it for which algorithm works slower.

Upvotes: 2

Answers (2)

Ralf Stubner

Reputation: 26823

The main reason for the inefficiency of your code seems to be mixing of the two partitioning schemes you want to compare. You claim to use the Hoare partition scheme, and the code looks very much like it, but pivot is calculated according to the Lomuto partition scheme. In addition, you should return j if i >= j, not if i > j. Fixing these two things and replacing i++ with the slightly faster ++i I get:

// partition using hoare's scheme

#include <Rcpp.h>
using namespace Rcpp;

int partition(NumericVector a,int start,int end)
{
    double pivot = a[(start + end) / 2];
    int i = start - 1;
    int j = end + 1;
    //Rcout << a <<"\n";
    while(1)
    {
        do {
            ++i;
        } while (a[i] < pivot);

        do {
            --j;
        } while (pivot < a[j]);

        if(i >= j)
            return j;

        //special.Swap(a, i, j);
        std::swap(a[i], a[j]);
    }
}

void qsort(NumericVector a,int start,int end)
{
    //Rcout << start <<"," << end <<"\n";
    if(start < end)
    {
        int P_index = partition(a, start, end);
        //Rcout << P_index << "\n";
        qsort(a, start, P_index);
        qsort(a, P_index + 1, end);
    }
}


// [[Rcpp::export]]
NumericVector QuickSortH_WC(NumericVector arr)
{
    int len = arr.size();
    qsort(arr, 0, len-1);
    //Rcout << arr <<"\n";
    return arr;
}

/*** R
set.seed(42)
dat <- runif(1e6)
bench::mark(QuickSortH_WC(dat), sort(dat, method="quick"))
*/

Output

> bench::mark(QuickSortH_WC(dat), sort(dat, method="quick"))
# A tibble: 2 x 13
  expression                     min  median `itr/sec` mem_alloc `gc/sec` n_itr
  <bch:expr>                  <bch:> <bch:t>     <dbl> <bch:byt>    <dbl> <int>
1 QuickSortH_WC(dat)          95.7ms 100.5ms      8.63    2.49KB     43.2     5
2 sort(dat, method = "quick")   15ms  16.5ms     53.1    11.44MB     23.6    27
# … with 6 more variables: n_gc <dbl>, total_time <bch:tm>, result <list>,
#   memory <list>, time <list>, gc <list>
Warning message:
Some expressions had a GC in every iteration; so filtering is disabled.

So while this method is about a factor of 7 slow than R's sort, it has at least comparable order of magnitude for the run time. (Thanks @JosephWood for digging out the link). And Wikipedia lists even more improvements over these two schemas.

BTW, I also changed the wrapper function to return the changed array. This allows me to use the default behavior of bench::mark which is to compare the returned results. I find that useful ...

Upvotes: 3

MACHERKI M E

Reputation: 131

Rcpp apply recursive functions badly. I suggest iterative quick sort implementation:

void _Quick_sorti( double _array[],int _l,int _h){
    int *_stack=new int [_h-_l+1]; double _tmp;int _i,_p,_top=-1;
    _stack[++_top]=_l;_stack[++_top]=_h;
    while(_top>=0){
      _h=_stack[_top--];_l=_stack[_top--];
      _tmp=_array[_h];
      _i=_l-1;
      for(int _j=_l;_j<=_h-1;_j++){
        if(_array[_j]<=_tmp){_i++;std::swap(_array[_i],_array[_j]);}
      }
      _p=_i+1;
      std::swap(_array[_p],_array[_h]);
      if(_p-1>_l){_stack[++_top]=_l;_stack[++_top]=_p-1;}
      if(_p+1<_h){_stack[++_top]=_p+1;_stack[++_top]=_h;}
    }
  delete _stack;
  }


     // [[Rcpp::export]]
SEXP Quick_sorti(SEXP &unsorted) { //run
  SEXP temp=clone(unsorted);// or Rf_duplicate
  double *z=REAL(temp);
  int N=LENGTH(temp)-1;
  int k=0;
  _Quick_sorti(z,k,N); // note that we have provide lvalue (if we put 0 it will not works int place of N)
  return temp;}

The code is adapted from a macros that include '_' prefix and look ugly moreover it use R internals. Adding stack imply N more memory requirement.

Upvotes: 1

why quick sort implemented in rcpp works slow?

Answers (2)

Related Questions