Reputation: 1519
I'm curious as to why the STL container unordered_set
, which has constant time complexity for random access on average, does not provide a method for accessing elements by some distance from the first element in the container. For example:
T& unordered_set::operator[](size_t index)
{
return *(begin() + index);
}
Upvotes: 6
Views: 7094
Reputation: 106096
unordered_set
is implemented as a hash table, in which there are randomly-accessible "buckets" each of which effectively contains a linked list of 0 or more elements. So, a snapshot of an unordered_set
storing the numbers 1 through 7 might happen to look something like this (the exact positioning of elements depends on the hash function used, so this is just illustrative):
buckets linked-list of elements
[0] 1 --> 5 --> nullptr
[1] nullptr
[2] 4 --> nullptr
[3] nullptr
[4] nullptr
[5] 7 --> nullptr
[6] 6 --> 3 --> 2 --> nullptr
[7] nullptr
As you can see, there's no easy way to advance n
elements... you basically have to follow linked lists, skipping to the next bucket when you find a nullptr
. That's why the begin()
operation can't return a Random Access Iterator with O(1) times for + n
movement (it only provides a Forward Iterator)....
So when you ask...
unordered_set
, which has constant time complexity for random access on average
...I think you're confusing random access by key with random access by index. You can find any given key in O(1) amortised constant time, but finding the nth element is O(n).
(Note: the C++11 Standard does not leave implementations the freedom to choose closed hashing (aka open addressing) to implement unordered_set
... that's evident from the max_load_factor
after construction being required to be 1.0
, and the rule that during insert
/emplace
iterator invalidation can only happen when the max_load_factor
is exceeded.)
Upvotes: 6
Reputation: 145269
In order to provide constant amortized time for item membership checking, which is the advantage of an unordered set, for the general case of some arbitrary item type it has to be implemented as a hash table.
It's not difficult to ensure, in constant time, that every key (item) in a hash table is also referred to by a node of a linked list. This provides a means to iterate over the items in the table, in unspecified order. However, going to the i-th item in a linked list is linear time.
There is a compromise solution where as each item is added to the hash table, it's also added to a sorted tree. This requires the items to be comparable, and it increases the add and remove complexity to logarithmic (keeping constant time for checking). But while this supports access of the i-th item in logarithmic time, which item is the i-th one will vary, and there is really no big demand for this functionality.
The clincher is that C++11 requires O(1) average time for inserts in an unordered container, which is incompatible with the ordered tree.
So, since direct indexing is impractical (linear time) and not in demand, it's not offered, but you can always use *std::next( s.begin(), i )
as a linear time alternative to a hypothetical s[i]
. In principle you can optimize that for a set that doesn't change, by copying it to a std::vector
. But in most cases it will be better to use iterators.
Upvotes: 1
Reputation: 726569
Accessing an element "by some distance" implies that there is some meaningful way to measure that distance. The trouble with std::unordered_set
is that is is, well, unordered. Hence, there is no meaningful way of explaining "some distance from the beginning" in a non-arbitrary way.
If you want to access by distance, copy the data into a vector:
std::vector tmp(unordered.begin(), unordered.end());
Upvotes: 6
Reputation: 308158
unordered_set
is unordered by definition, so accessing it by index isn't terribly useful. The index of any particular element may change any time you make an insertion.
Also, according to this reference, the iterator for an unordered_set
is a forward iterator, not random access.
Upvotes: 2