Reputation: 720
I noticed that unique function from Rcpp orders the results
evalCpp("unique(IntegerVector::create(6,6,1,5,5,1))")
[1] 6 5 1
unique(c(6,6,1,5,5,1))
[1] 6 1 5
Is there a way to avoid this ? Thanks for your help
Upvotes: 1
Views: 485
Reputation: 11
You may use duplicated(_) == 0
to dedup your vector while keeping the order of first appearance:
IntegerVector x = {6,6,1,5,5,1};
IntegerVector out = x[duplicated(x) == 0];
Rcout << out << std::endl;
Upvotes: 0
Reputation: 26258
This is how I've implemented it, along with the issue I was trying to solve when I came up with it (using this answer, and it also shows various other solutions and benchmarks).
template < typename T, int RTYPE >
inline SEXP sexp_unique( Rcpp::Vector< RTYPE > x ) {
std::set< T > seen;
auto newEnd = std::remove_if( x.begin(), x.end(), [&seen]( const T value ) {
if ( seen.find( value ) != std::end( seen ) ) {
return true;
}
seen.insert( value );
return false;
});
x.erase( newEnd, x.end() );
return x;
}
// returns unique values in their original input order
inline SEXP get_sexp_unique( SEXP s ) {
SEXP s2 = Rcpp::clone( s );
switch( TYPEOF( s2 ) ) {
case LGLSXP: {
return sexp_unique< bool, LGLSXP >( s2 );
}
case REALSXP: {
return sexp_unique< double, REALSXP >( s2 );
}
case INTSXP: {
return sexp_unique< int, INTSXP >( s2 );
}
case STRSXP: {
return sexp_unique< char* , STRSXP >( s2 );
}
default: Rcpp::stop("unknown vector type");
}
return 0;
}
Upvotes: 1
Reputation: 368509
If you look at the (short) source file you see that it uses an internal class IndexHash
. I suspect this one sorts by default.
If original order is paramount, I supposed you could write yourself a new convenience wrapper. It cannot be all this hard: at the risk of wasting a few bytes of memory, allocate a temporary logical vector, use a standard hashmap and loop over the incoming vector. For each value, ask if the hashmap has seen this value, store the boolean answer. Then use that to index the original vector.
Chances are this is even implemented somewhere. Also look at Armadillo and Eigen for utility functions.
Upvotes: 2
Reputation: 2877
This one might help someone - works only for sorted vectors.
template <int ITYPE>
Rcpp::Vector<ITYPE> unique(Rcpp::Vector<ITYPE> x) {
int n = x.size();
if (n == 1) return(x);
Rcpp::Vector<ITYPE> res;
res.push_back(x(0));
for (int i = 1; i < n; i++) {
if (x[i] != x(i - 1)) {
res.push_back(x(i));
}
}
return res;
}
Upvotes: 0