rtpax
rtpax

Reputation: 1777

Randomly select an element from a std::set in O(log(n)) time

Is there a way to randomly select an element from an std::set in O(log(n)) or even better O(1) time? It doesn't need to have an extremely even distribution, just something fairly random (though obviously even is better)

Looking at std::set::extract seems like it might be promising since it could potentially divide the set in half in constant time, but I can't find a good way to identify the nodes close to the root, and the node_type documentation I could find doesn't go into much detail.

If all else fails, the contents could be copied to a std::map with random keys in what I think would be O(n log(n)) time, which would get me amortized O(log(n)) time, but that is not the preferred solution since it would require some overhead in the cases where I don't want everything

Upvotes: 1

Views: 460

Answers (2)

eerorika
eerorika

Reputation: 238301

If the elements themselves within the set are evenly distributed within some domain of values, then you could generate a random value within that domain, and use std::set::lower_bound to get the first element contained in the set that is not less than the random value.

Given that you don't need an extremely even distribution, the evenness requirement of the elements within the set might not be extremely necessary. The likelihood of an element to be chosen depends on its comparative distance from its predecessor element.

For an even distribution, I don't think there's better than *std::next(std::begin(s), random_index), which is linear in complexity.


For a good general solution with even distribution, and logarithmic asymptotic complexity, you need another data structure than std::set.

In particular, one good choice is the Order statistic tree which augments a search tree by adding into a node the size of its subtree. OST has a Select(i) operation, which is similar to array subscript operation, and you can pick a random element between index 0...N in the same way.

Another option is to use a sorted array. The sorted property can be used to keep the logarithmic lookup property that std::set has.

Unfortunately though, the standard library has neither order statistic tree, nor a sorted-array-set.

Upvotes: 2

Gold
Gold

Reputation: 136

std::set complexity is o(log(n)) so you can't lower search complexity bellow that one using std::set alone. You may use use an indexing structure like a vector to achieve that.

Besides that you cant achieve a random seach using this code :

 std::random_device              rd;
 std::mt19937                    gen(rd());
 std::uniform_int_distribution<> dis( 0, set::size ( ) );

then

 set::operator [] (dis(gen));

or something appoching

Upvotes: 0

Related Questions