Reputation: 68750
Assume I have a vector<int>
intVec
and a vector<vector<double> >
matrix
. I want to sort intVec
and reorder the first dimension of matrix
correspondingly in C++. I realize this question has been asked several times before, but this case has a twist. A vector<double>
is expensive to copy, so e.g. copying both intVec
and matrix
to a vector<pair<int, vector<double> >
, sorting that and copying them back is even more inefficient than usual.
Short of rolling my own custom sorting algorithm, how can I sort intVec
and reorder the first dimension of matrix
in lockstep without copying any element of matrix
and invoking vector
's copy constructor?
Upvotes: 5
Views: 2225
Reputation: 4612
I'm pretty sure that --- when you use some up to date compiler (let's say gcc 4.4 and up) --- there nothing really copied: nowadays objects within C++ standard library containers are (mostly) always moved. Therefore IMHO there is no need to have any fear about expensive copies.
Have a look at the following example - it was written using gcc 4.4.6 under Debian. As you can see, during the 'reordering' phase, there is no call to the copy constructor and also no call to the `operator=(... & other)'.
#include <vector>
#include <iostream>
#include <iomanip>
class VeryExpensiveToCopy {
public:
explicit VeryExpensiveToCopy(long i) : id(i) { ++cnt_normal_cstr; }
// Just to be sure this is never used.
VeryExpensiveToCopy & operator=(VeryExpensiveToCopy & other) = delete;
VeryExpensiveToCopy(VeryExpensiveToCopy & other) = delete;
VeryExpensiveToCopy(VeryExpensiveToCopy && other) : id(other.id) {
++cnt_move_cstr;
}
VeryExpensiveToCopy & operator=(VeryExpensiveToCopy && other) {
id = other.id; ++cnt_op_as_move; return *this;
}
long get_id() const { return id; }
static void print_stats(std::string const & lref) {
std::cout << "[" << std::setw(20) << lref << "] Normal Cstr ["
<< cnt_normal_cstr
<< "] Move Cstr [" << cnt_move_cstr
<< "] operator=(&&) [" << cnt_op_as_move << "]" << std::endl;
}
private:
long id;
static long cnt_normal_cstr;
static long cnt_move_cstr;
static long cnt_op_as_move;
};
// Counts the number of calls.
long VeryExpensiveToCopy::cnt_normal_cstr { 0 };
long VeryExpensiveToCopy::cnt_move_cstr { 0 };
long VeryExpensiveToCopy::cnt_op_as_move { 0 };
int main() {
std::vector<VeryExpensiveToCopy> v;
VeryExpensiveToCopy::print_stats("Start");
for(auto i(0); i<100000; ++i) {
v.emplace_back(i);
}
VeryExpensiveToCopy::print_stats("After initialization");
for(auto i(0); i<100000-1; ++i) {
v[i] = std::move(v[i+1]);
}
VeryExpensiveToCopy::print_stats("After moving");
for(auto i(0); i<100000-1; ++i) {
if(v[i].get_id() != i+1) { abort(); }
}
VeryExpensiveToCopy::print_stats("After check");
return 0;
}
Output:
[ Start] Normal Cstr [0] Move Cstr [0] operator=(&&) [0]
[After initialization] Normal Cstr [100000] Move Cstr [131071] operator=(&&) [0]
[ After moving] Normal Cstr [100000] Move Cstr [131071] operator=(&&) [99999]
[ After check] Normal Cstr [100000] Move Cstr [131071] operator=(&&) [99999]
Upvotes: 0
Reputation: 35934
Since std::vector::swap
operates in constant time, you could use a sorting algorithm that operates via a series of swaps (like quicksort) to sort intVec
while simultaneously performing identical swaps on matrix
:
#include <iostream>
#include <vector>
#include <algorithm>
// Sorts intVec in [low, high) while also performing identical swaps on matrix.
void iqsort(std::vector<int> &intVec, std::vector<std::vector<double>> &matrix,
int low, int high) {
if (low >= high) return;
int pivot = intVec[low];
int nLow = low + 1;
int nHigh = high - 1;
while (nLow <= nHigh) {
if (intVec[nLow] <= pivot) {
++nLow;
} else {
std::swap(intVec[nLow], intVec[nHigh]);
std::swap(matrix[nLow], matrix[nHigh]);
--nHigh;
}
}
std::swap(intVec[low], intVec[nHigh]);
std::swap(matrix[low], matrix[nHigh]);
iqsort(intVec, matrix, low, nHigh);
iqsort(intVec, matrix, nLow, high);
}
int main() {
std::vector<int> intVec = {10, 1, 5};
std::vector<std::vector<double>> matrix = {{33.0}, {11.0}, {44.0}};
iqsort(intVec, matrix, 0, intVec.size());
// intVec is {1, 5, 10} and matrix is {{11.0}, {44.0}, {33.0}}
}
Upvotes: 0
Reputation: 279285
A
vector<double>
is expensive to copy, so e.g. copying both intVec and matrix to avector<pair<int, vector<double> >
, sorting that and copying them back is even more inefficient than usual.
The simplest way to get the optimization you want then is to swap the elements of the source vector<vector<double>>
into the temporary vector<pair<int, vector<double>>
, sort it, and then swap them back to their new locations in the original vector.
There will still be more overhead than is strictly necessary (for example to construct and destroy empty vectors). However, no vector is ever copied and the code remains very similar to what you already have. So if you are correct that the problem is the cost of copying, problem solved.
In C++11 you could move in both directions rather than swapping. I doubt there's much performance difference between moving and swapping with an empty vector, but I'm not sure there isn't.
Upvotes: 4
Reputation: 96261
The obvious answer is to restructure your two vectors as one vector<pair<int, vector<double> >
(since the data are clearly tightly coupled aready).
If that's really not an option then create another vector of indexes and sort that instead of the vec and matrix.
Upvotes: 0
Reputation: 145289
If you don't want to copy the vector<double>
item vector, then make a vector of pointer or indices to the vector<double>
items. Sort it along with the main vector.
It is however not at all clear that you'll get a performance gain, and so I suggest that you measure both the straightforward sorting and the smart sorting, and compare.
Example:
#include <algorithm>
#include <vector>
#include <iostream>
using namespace std;
struct Mat
{
vector< vector< double > > items;
Mat( int const size )
: items( size, vector< double >( size ) )
{}
};
struct KeyAndData
{
int key;
vector< double > const* data;
friend bool operator<( KeyAndData const& a, KeyAndData const& b )
{
return a.key < b.key;
}
};
int main()
{
int const data[] = {3, 1, 4, 1, 5};
Mat m( 5 );
vector<int> v( 5 );
for( int i = 0; i < 5; ++i )
{
m.items[i][i] = v[i] = data[i];
}
vector< KeyAndData > sorted( 5 );
for( int i = 0; i < 5; ++i )
{
sorted[i].key = v[i];
sorted[i].data = &m.items[i];
}
sort( sorted.begin(), sorted.end() );
for( int i = 0; i < 5; ++i )
{
cout << sorted[i].key << ": ";
vector< double > const& r = *sorted[i].data;
for( int x = 0; x < 5; ++x )
{
cout << r[x] << " ";
}
cout << endl;
}
}
Upvotes: 0
Reputation: 14205
Based on the space between your two >
's, I'm guessing you're using pre-C++11 C++. In C++11, std::sort
seems to move elements whenever possible instead of copying.
You can pass a custom comparator to std::sort
. However, even if you do this, you're doing Theta(n log n) copies of pair<int, vector<double> >
's.
I'd guess, based on not actually trying it, that you should sort a pair<int, vector<double> *>
(or pair<int, int>
if int
is big enough), using the second int
as an index) instead to get the appropriate permutation and then apply the permutation using vector
's swap
member function to avoid copying the vector contents.
Upvotes: 1
Reputation: 15944
One option: create a std::vector<std::pair<int,size_t>>
where the first element is the int from intVec and the second element is the original index of that element. Then sort that new vector. Then shuffle your matrix and intVec to the order indicated by the second elements of the pairs (e.g., with a single pass, doing swaps).
Upvotes: 1