Reputation: 2973
I have a vector<uint64_t> keys
and a vector<char> vals
, both of size N
. I would like to sort keys
and vals
based on entries in keys
.
An obvious solution is copying into a vector<pair<uint64_t, char>>
, sorting that, and copying the sorted data back out, but I would like to avoid copying, and I would like to avoid the alignment padding: sizeof(pair<uint64_t, char>)
is 2*sizeof(uint64_t)
, or 16 bytes, due to alignment; much more than the 9 bytes needed.
In other words, although the following C++11 implementation is correct, it is not efficient enough:
#include <algorithm>
#include <tuple>
using namespace std;
void aux_sort(vector<uint64_t> & k, vector<char> & v) {
vector<pair<uint64_t, char> > kv(k.size());
for (size_t i = 0; i < k.size(); ++i) kv[i] = make_pair(k[i], v[i]);
sort(kv.begin(), kv.end());
for (size_t i = 0; i < k.size(); ++i) tie(k[i], v[i]) = kv[i];
}
Although the following C++11 implementation is correct, I want to use std::sort
instead of hand-coding my own sorting algorithm:
#include <algorithm>
using namespace std;
void aux_sort(vector<uint64_t> & k, vector<char> & v) {
for (size_t i = 0; i < k.size(); ++i)
for (size_t j = i; j--;)
if (k[j] > k[j + 1]) {
iter_swap(&k[j], &k[j + 1]);
iter_swap(&v[j], &v[j + 1]);
}
}
(Edit to add, in response to @kfsone) Although the following implementation is correct, it is not in-place, since permutation according to indices
needs a copy (or alternatively, a prohibitively complex linear time in-place permutation algorithm that I am not going to implement):
#include <algorithm>
#include <tuple>
using namespace std;
void aux_sort(vector<uint64_t> & k, vector<char> & v) {
vector<size_t> indices(k.size());
iota(indices.begin(), indices.end(), 0);
sort(indices.begin(), indices.end(),
[&](size_t a, size_t b) { return k[a] < k[b]; });
vector<uint64_t> k2 = k;
vector<char> v2 = v;
for (size_t i = 0; i < k.size(); ++i)
tie(k[i], v[i]) = make_pair(k2[indices[i]], v2[indices[i]]);
}
What is the easiest way to apply STL algorithms such as std::sort
to a sequence of key/value-pairs in-place, with keys and values stored in separate vectors?
Background: My application is reading large (40 000 by 40 000) rasters that represent terrains, one row at a time. One raster assigns each cell a label between 0 and 10 000 000 such that labels are contiguous, and another raster assigns each cell a value between 0 and 255. I want to sum the values for each label in an efficient manner, and I think the fastest way is to sort the label row, and for each swap during the sort, apply the same swap in the value row. I want to avoid coding std::sort, std::set_intersection and others by hand.
Upvotes: 1
Views: 749
Reputation: 2973
In fact it is easy to permute the input vectors according to indices
in-place (contrary to the claim in the question):
#include <algorithm>
#include <tuple>
using namespace std;
void aux_sort(vector<uint64_t> & k, vector<char> & v) {
vector<size_t> indices(k.size());
iota(indices.begin(), indices.end(), 0);
sort(indices.begin(), indices.end(),
[&](size_t a, size_t b) { return k[a] < k[b]; });
for (size_t i = 0; i < k.size(); ++i)
while (indices[i] != i) {
swap(k[i], k[indices[i]]);
swap(v[i], v[indices[i]]);
swap(indices[i], indices[indices[i]]);
}
}
However, this solution is perhaps undesirable since it incurs many more cache-faults than the sorting itself as the input is traversed in the order of indices
, which can possibly incur one cache fault per element. On the other hand, quicksort incurs much fewer cache faults (O(n/B log n/M) when pivots are random, where B is the size of a cache line and M is the size of the cache).
Upvotes: 1
Reputation: 275906
Range adapters. The most direct route would be a zip range, that takes two equal length ranges over T and U respectively, and produces a range over pair<T&,U&>
. (containers are a kind of range -- a range that owns its contents)
You then sort this by .first
(or use default sort, where .second
determines ties).
The range is never a container, the wrapping into pairs happens on the fly with each dereference of the zip iterator.
boost
has a zip iterators and zip ranges, but you can write them yourself. The boost iterators/ranges may be read only, but the link also contains an implementation of zipping that is not, and maybe boost has upgraded.
Upvotes: 4
Reputation: 7656
I do not believe that it is possible to satisfy all the constraints that you have set up for the solution. It is almost certainly possible to hack the STL to sort the arrays. However, the solution is likely to be both clumsy and slower than just copying the data, sorting it, and copying it back.
If you have the option, you might want to consider just storing the data in a single vector
to begin with.
Upvotes: 0
Reputation: 10596
You can use the thrust library and use the sort by key function. Not STL, but has the (dubious) advantage of being easily ported to a nVIdia GPU.
Upvotes: 1