VinS
VinS

Reputation: 194

find element in linked list

There is a data structure providing iterators with the following interface:

struct iterator {
    T value();
    void next();
    bool isValid();
}

How would you design an algorithm which at the end of the loop returns some value from the list with equal probability for each element? The list can be very long, so length of the list cannot be represented by int or long. The list cannot be modified.

Any ideas? Thanks.

Upvotes: 0

Views: 63

Answers (1)

hyde
hyde

Reputation: 62797

You don't know the length of the list, so you have to iterate it all the way to the end. You need to keep copy of the currently chosen items value (or item, but there seems to be no way to do that in your question). Then for every new item you iterate to, you need to determine if the chosen item should be kept or changed, with decreasing probability.

Since you state that length of list may not fit in a native/primitive data type (I assume you mean that when you talk about int and long, which are programming language and even dialect specific data types, and you don't specify programming language), I take it you mean the list may be arbitrarily long. So you need a bignum library, which can give you arbitrary random numbers.

Pseudocode:

T current = empty value
bignum index = 0
iterator = first item of list
while iterator.isValid()
    index = index + 1
    if bignum_random_below(index) == 0
        # 0 is interpreted as, take value from current index,
        # everything else is interpreted as keep the previous value
        current  = iterator.value()
    end if
    iterator.next()
end while
# index value 0 indicates empty list, even if T doesn't have empty value

From comment of M Oehm: this is called reservoir sampling.

Upvotes: 3

Related Questions