Clue Less
Clue Less

Reputation: 3165

Picking a random element from a set

How do I pick a random element from a set? I'm particularly interested in picking a random element from a HashSet or a LinkedHashSet, in Java.

Upvotes: 217

Views: 272925

Answers (30)

Basil Bourque
Basil Bourque

Reputation: 340108

Make array of your Set, in modern Java

I like the Answer by Jorge Ferreira. I want to provide a version of that code updated for modern Java. Here we use newer features of Java such as ThreadLocalRandom, Stream & IntStream, method references, and Set.of convenient literals syntax.

You asked:

How do I pick a random element from a set?

  1. Make an array of our Set by calling the inherited method Collection#toArray.
    Understand that we are not duplicating the contained objects. We are merely copying the references held by that Set into the slots of the new array.
  2. Randomly pick an index number (annoying zero-based counting).
  3. Use that index to retrieve an object from our array.

Example code:

final Set < String > names = Set.of ( "Alice" , "Bob" , "Carol" , "Davis" , "Eve" , "Frank" , "Georgette" , "Harry" , "Isabel" , "Jean-Paul" );
String[] array = names.toArray ( String[] :: new );  // Make array of the set, same elements.
int index = ThreadLocalRandom.current ( ).nextInt ( 0 , array.length ); // Randomly choose an index. The `nextInt` method is Half-Open, with beginning inclusive while the ending is exclusive.
String name = array[ index ]; // Access array by random index.

Georgette

If we know for certain the Set is not going change its length during this time (no concurrency), then we can make this code more brief. We can get the size of the Set rather than the length of the array.

And we can fold that index-picking into its following line, for move brevity.

final Set < String > names = Set.of ( "Alice" , "Bob" , "Carol" , "Davis" , "Eve" , "Frank" , "Georgette" , "Harry" , "Isabel" , "Jean-Paul" );
String name = names.toArray ( String[] :: new )[ ThreadLocalRandom.current ( ).nextInt ( 0 , names.size ( ) ) ]; // Access array by random index.

Carol

We may want to retrieve multiple elements randomly.

Instead of a solitary random index number, we generate multiple random index numbers by calling ThreadLocalRandom.current ( ).ints to produce an IntStream (a stream of int integer numbers). We call distinct() to eliminate any duplicates (if that is your goal). We call limit to stop the IntStream after getting our desired number of index numbers. The mapToObj means we want to trade each int number for some object. In our case we want to use each int number as an index into the array, retrieving an object from that array, and feeding that retrieved object into our stream. Lastly we collect those retrieved objects into a List.

final Set < String > names = Set.of ( "Alice" , "Bob" , "Carol" , "Davis" , "Eve" , "Frank" , "Georgette" , "Harry" , "Isabel" , "Jean-Paul" );
String[] array = names.toArray ( String[] :: new );
final int LIMIT = 3;
List < String > randomNames =
        ThreadLocalRandom
                .current ( )
                .ints ( 0 , array.length )
                .distinct ( )
                .limit ( LIMIT )
                .mapToObj ( index -> array[ index ] )
                .toList ( );

[Bob, Georgette, Eve]

Upvotes: 0

Jorge Ferreira
Jorge Ferreira

Reputation: 97967

Make array of your Set

You can easily make an array of the elements contained in your Set (or any Collection). Just call Collection#toArray.

Then retrieve objects by random index numbers.

Set<Integer> set = new LinkedHashSet<Integer>(3);
set.add(1);
set.add(2);
set.add(3);

Random rand = new Random( System.currentTimeMillis() );
int[] setArray = (int[]) set.toArray();
for ( int index = 0; index < 10; ++index ) {
    System.out.println( setArray[rand.nextInt(set.size())] );
}

Upvotes: 10

AkshayKrish
AkshayKrish

Reputation: 207

List<Integer> list=new ArrayList<Integer>(set);
int r=(int)(Math.random()*set.size());
return list.get(r);

Upvotes: 0

Mr.Wizard
Mr.Wizard

Reputation: 24336

In Mathematica:

a = {1, 2, 3, 4, 5}

a[[ ⌈ Length[a] Random[] ⌉ ]]

Or, in recent versions, simply:

RandomChoice[a]

Random[] generates a pseudorandom float between 0 and 1. This is multiplied by the length of the list and then the ceiling function is used to round up to the next integer. This index is then extracted from a.

Since hash table functionality is frequently done with rules in Mathematica, and rules are stored in lists, one might use:

a = {"Badger" -> 5, "Bird" -> 1, "Fox" -> 3, "Frog" -> 2, "Wolf" -> 4};

Upvotes: 0

hub
hub

Reputation: 333

Java 8+ Stream:

    static <E> Optional<E> getRandomElement(Collection<E> collection) {
        return collection
                .stream()
                .skip(ThreadLocalRandom.current()
                .nextInt(collection.size()))
                .findAny();
    }

Based on the answer of Joshua Bone but with slight changes:

  • Ignores the Streams element order for a slight performance increase in parallel operations
  • Uses the current thread's ThreadLocalRandom
  • Accepts any Collection type as input
  • Returns the provided Optional instead of null

Upvotes: 5

Joshua Bone
Joshua Bone

Reputation: 601

In Java 8:

static <E> E getRandomSetElement(Set<E> set) {
    return set.stream().skip(new Random().nextInt(set.size())).findFirst().orElse(null);
}

Upvotes: 47

RKumsher
RKumsher

Reputation: 2897

If you don't mind a 3rd party library, the Utils library has a IterableUtils that has a randomFrom(Iterable iterable) method that will take a Set and return a random element from it

Set<Object> set = new HashSet<>();
set.add(...);
...
Object random = IterableUtils.randomFrom(set);

It is in the Maven Central Repository at:

<dependency>
  <groupId>com.github.rkumsher</groupId>
  <artifactId>utils</artifactId>
  <version>1.3</version>
</dependency>

Upvotes: 0

Thomas Ahle
Thomas Ahle

Reputation: 31624

Unfortunately, this cannot be done efficiently (better than O(n)) in any of the Standard Library set containers.

This is odd, since it is very easy to add a randomized pick function to hash sets as well as binary sets. In a not to sparse hash set, you can try random entries, until you get a hit. For a binary tree, you can choose randomly between the left or right subtree, with a maximum of O(log2) steps. I've implemented a demo of the later below:

import random

class Node:
    def __init__(self, object):
        self.object = object
        self.value = hash(object)
        self.size = 1
        self.a = self.b = None

class RandomSet:
    def __init__(self):
        self.top = None

    def add(self, object):
        """ Add any hashable object to the set.
            Notice: In this simple implementation you shouldn't add two
                    identical items. """
        new = Node(object)
        if not self.top: self.top = new
        else: self._recursiveAdd(self.top, new)
    def _recursiveAdd(self, top, new):
        top.size += 1
        if new.value < top.value:
            if not top.a: top.a = new
            else: self._recursiveAdd(top.a, new)
        else:
            if not top.b: top.b = new
            else: self._recursiveAdd(top.b, new)

    def pickRandom(self):
        """ Pick a random item in O(log2) time.
            Does a maximum of O(log2) calls to random as well. """
        return self._recursivePickRandom(self.top)
    def _recursivePickRandom(self, top):
        r = random.randrange(top.size)
        if r == 0: return top.object
        elif top.a and r <= top.a.size: return self._recursivePickRandom(top.a)
        return self._recursivePickRandom(top.b)

if __name__ == '__main__':
    s = RandomSet()
    for i in [5,3,7,1,4,6,9,2,8,0]:
        s.add(i)

    dists = [0]*10
    for i in xrange(10000):
        dists[s.pickRandom()] += 1
    print dists

I got [995, 975, 971, 995, 1057, 1004, 966, 1052, 984, 1001] as output, so the distribution seams good.

I've struggled with the same problem for myself, and I haven't yet decided weather the performance gain of this more efficient pick is worth the overhead of using a python based collection. I could of course refine it and translate it to C, but that is too much work for me today :)

Upvotes: 0

Jason Hartley
Jason Hartley

Reputation: 2509

This is identical to accepted answer (Khoth), but with the unnecessary size and i variables removed.

    int random = new Random().nextInt(myhashSet.size());
    for(Object obj : myhashSet) {
        if (random-- == 0) {
            return obj;
        }
    }

Though doing away with the two aforementioned variables, the above solution still remains random because we are relying upon random (starting at a randomly selected index) to decrement itself toward 0 over each iteration.

Upvotes: 6

Khoth
Khoth

Reputation: 13328

int size = myHashSet.size();
int item = new Random().nextInt(size); // In real life, the Random object should be rather more shared than this
int i = 0;
for(Object obj : myhashSet)
{
    if (i == item)
        return obj;
    i++;
}

Upvotes: 104

dimo414
dimo414

Reputation: 48874

With Guava we can do a little better than Khoth's answer:

public static E random(Set<E> set) {
  int index = random.nextInt(set.size();
  if (set instanceof ImmutableSet) {
    // ImmutableSet.asList() is O(1), as is .get() on the returned list
    return set.asList().get(index);
  }
  return Iterables.get(set, index);
}

Upvotes: 3

chickeninabiscuit
chickeninabiscuit

Reputation: 9351

A somewhat related Did You Know:

There are useful methods in java.util.Collections for shuffling whole collections: Collections.shuffle(List<?>) and Collections.shuffle(List<?> list, Random rnd).

Upvotes: 77

BHARAT ARYA
BHARAT ARYA

Reputation: 3

If set size is not large then by using Arrays this can be done.

int random;
HashSet someSet;
<Type>[] randData;
random = new Random(System.currentTimeMillis).nextInt(someSet.size());
randData = someSet.toArray();
<Type> sResult = randData[random];

Upvotes: -1

stivlo
stivlo

Reputation: 85546

A generic solution using Khoth's answer as a starting point.

/**
 * @param set a Set in which to look for a random element
 * @param <T> generic type of the Set elements
 * @return a random element in the Set or null if the set is empty
 */
public <T> T randomElement(Set<T> set) {
    int size = set.size();
    int item = random.nextInt(size);
    int i = 0;
    for (T obj : set) {
        if (i == item) {
            return obj;
        }
        i++;
    }
    return null;
}

Upvotes: 0

Nicu Marasoiu
Nicu Marasoiu

Reputation: 11

The easiest with Java 8 is:

outbound.stream().skip(n % outbound.size()).findFirst().get()

where n is a random integer. Of course it is of less performance than that with the for(elem: Col)

Upvotes: 1

Philipp
Philipp

Reputation: 4799

If you really just want to pick "any" object from the Set, without any guarantees on the randomness, the easiest is taking the first returned by the iterator.

    Set<Integer> s = ...
    Iterator<Integer> it = s.iterator();
    if(it.hasNext()){
        Integer i = it.next();
        // i is a "random" object from set
    }

Upvotes: 1

thepace
thepace

Reputation: 2221

Solution above speak in terms of latency but doesn't guarantee equal probability of each index being selected.
If that needs to be considered, try reservoir sampling. http://en.wikipedia.org/wiki/Reservoir_sampling.
Collections.shuffle() (as suggested by few) uses one such algorithm.

Upvotes: 2

Sean Van Gorder
Sean Van Gorder

Reputation: 3453

This is faster than the for-each loop in the accepted answer:

int index = rand.nextInt(set.size());
Iterator<Object> iter = set.iterator();
for (int i = 0; i < index; i++) {
    iter.next();
}
return iter.next();

The for-each construct calls Iterator.hasNext() on every loop, but since index < set.size(), that check is unnecessary overhead. I saw a 10-20% boost in speed, but YMMV. (Also, this compiles without having to add an extra return statement.)

Note that this code (and most other answers) can be applied to any Collection, not just Set. In generic method form:

public static <E> E choice(Collection<? extends E> coll, Random rand) {
    if (coll.size() == 0) {
        return null; // or throw IAE, if you prefer
    }

    int index = rand.nextInt(coll.size());
    if (coll instanceof List) { // optimization
        return ((List<? extends E>) coll).get(index);
    } else {
        Iterator<? extends E> iter = coll.iterator();
        for (int i = 0; i < index; i++) {
            iter.next();
        }
        return iter.next();
    }
}

Upvotes: 35

sivi
sivi

Reputation: 11144

you can also transfer the set to array use array it will probably work on small scale i see the for loop in the most voted answer is O(n) anyway

Object[] arr = set.toArray();

int v = (int) arr[rnd.nextInt(arr.length)];

Upvotes: 0

Thomas Ahle
Thomas Ahle

Reputation: 31624

For fun I wrote a RandomHashSet based on rejection sampling. It's a bit hacky, since HashMap doesn't let us access it's table directly, but it should work just fine.

It doesn't use any extra memory, and lookup time is O(1) amortized. (Because java HashTable is dense).

class RandomHashSet<V> extends AbstractSet<V> {
    private Map<Object,V> map = new HashMap<>();
    public boolean add(V v) {
        return map.put(new WrapKey<V>(v),v) == null;
    }
    @Override
    public Iterator<V> iterator() {
        return new Iterator<V>() {
            RandKey key = new RandKey();
            @Override public boolean hasNext() {
                return true;
            }
            @Override public V next() {
                while (true) {
                    key.next();
                    V v = map.get(key);
                    if (v != null)
                        return v;
                }
            }
            @Override public void remove() {
                throw new NotImplementedException();
            }
        };
    }
    @Override
    public int size() {
        return map.size();
    }
    static class WrapKey<V> {
        private V v;
        WrapKey(V v) {
            this.v = v;
        }
        @Override public int hashCode() {
            return v.hashCode();
        }
        @Override public boolean equals(Object o) {
            if (o instanceof RandKey)
                return true;
            return v.equals(o);
        }
    }
    static class RandKey {
        private Random rand = new Random();
        int key = rand.nextInt();
        public void next() {
            key = rand.nextInt();
        }
        @Override public int hashCode() {
            return key;
        }
        @Override public boolean equals(Object o) {
            return true;
        }
    }
}

Upvotes: 1

Daniel Lubarov
Daniel Lubarov

Reputation: 7924

How about just

public static <A> A getRandomElement(Collection<A> c, Random r) {
  return new ArrayList<A>(c).get(r.nextInt(c.size()));
}

Upvotes: 1

Ben Noland
Ben Noland

Reputation: 34948

List asList = new ArrayList(mySet);
Collections.shuffle(asList);
return asList.get(0);

Upvotes: 7

fandrew
fandrew

Reputation: 419

Fast solution for Java using an ArrayList and a HashMap: [element -> index].

Motivation: I needed a set of items with RandomAccess properties, especially to pick a random item from the set (see pollRandom method). Random navigation in a binary tree is not accurate: trees are not perfectly balanced, which would not lead to a uniform distribution.

public class RandomSet<E> extends AbstractSet<E> {

    List<E> dta = new ArrayList<E>();
    Map<E, Integer> idx = new HashMap<E, Integer>();

    public RandomSet() {
    }

    public RandomSet(Collection<E> items) {
        for (E item : items) {
            idx.put(item, dta.size());
            dta.add(item);
        }
    }

    @Override
    public boolean add(E item) {
        if (idx.containsKey(item)) {
            return false;
        }
        idx.put(item, dta.size());
        dta.add(item);
        return true;
    }

    /**
     * Override element at position <code>id</code> with last element.
     * @param id
     */
    public E removeAt(int id) {
        if (id >= dta.size()) {
            return null;
        }
        E res = dta.get(id);
        idx.remove(res);
        E last = dta.remove(dta.size() - 1);
        // skip filling the hole if last is removed
        if (id < dta.size()) {
            idx.put(last, id);
            dta.set(id, last);
        }
        return res;
    }

    @Override
    public boolean remove(Object item) {
        @SuppressWarnings(value = "element-type-mismatch")
        Integer id = idx.get(item);
        if (id == null) {
            return false;
        }
        removeAt(id);
        return true;
    }

    public E get(int i) {
        return dta.get(i);
    }

    public E pollRandom(Random rnd) {
        if (dta.isEmpty()) {
            return null;
        }
        int id = rnd.nextInt(dta.size());
        return removeAt(id);
    }

    @Override
    public int size() {
        return dta.size();
    }

    @Override
    public Iterator<E> iterator() {
        return dta.iterator();
    }
}

Upvotes: 41

Dan Dyer
Dan Dyer

Reputation: 54495

If you want to do it in Java, you should consider copying the elements into some kind of random-access collection (such as an ArrayList). Because, unless your set is small, accessing the selected element will be expensive (O(n) instead of O(1)). [ed: list copy is also O(n)]

Alternatively, you could look for another Set implementation that more closely matches your requirements. The ListOrderedSet from Commons Collections looks promising.

Upvotes: 16

Aaron McDaid
Aaron McDaid

Reputation: 27183

C++. This should be reasonably quick, as it doesn't require iterating over the whole set, or sorting it. This should work out of the box with most modern compilers, assuming they support tr1. If not, you may need to use Boost.

The Boost docs are helpful here to explain this, even if you don't use Boost.

The trick is to make use of the fact that the data has been divided into buckets, and to quickly identify a randomly chosen bucket (with the appropriate probability).

//#include <boost/unordered_set.hpp>  
//using namespace boost;
#include <tr1/unordered_set>
using namespace std::tr1;
#include <iostream>
#include <stdlib.h>
#include <assert.h>
using namespace std;

int main() {
  unordered_set<int> u;
  u.max_load_factor(40);
  for (int i=0; i<40; i++) {
    u.insert(i);
    cout << ' ' << i;
  }
  cout << endl;
  cout << "Number of buckets: " << u.bucket_count() << endl;

  for(size_t b=0; b<u.bucket_count(); b++)
    cout << "Bucket " << b << " has " << u.bucket_size(b) << " elements. " << endl;

  for(size_t i=0; i<20; i++) {
    size_t x = rand() % u.size();
    cout << "we'll quickly get the " << x << "th item in the unordered set. ";
    size_t b;
    for(b=0; b<u.bucket_count(); b++) {
      if(x < u.bucket_size(b)) {
        break;
      } else
        x -= u.bucket_size(b);
    }
    cout << "it'll be in the " << b << "th bucket at offset " << x << ". ";
    unordered_set<int>::const_local_iterator l = u.begin(b);
    while(x>0) {
      l++;
      assert(l!=u.end(b));
      x--;
    }
    cout << "random item is " << *l << ". ";
    cout << endl;
  }
}

Upvotes: 1

pjb3
pjb3

Reputation: 5283

Clojure solution:

(defn pick-random [set] (let [sq (seq set)] (nth sq (rand-int (count sq)))))

Upvotes: 3

inglesp
inglesp

Reputation: 3357

In lisp

(defun pick-random (set)
       (nth (random (length set)) set))

Upvotes: 0

Mathew Byrne
Mathew Byrne

Reputation: 3759

Javascript solution ;)

function choose (set) {
    return set[Math.floor(Math.random() * set.length)];
}

var set  = [1, 2, 3, 4], rand = choose (set);

Or alternatively:

Array.prototype.choose = function () {
    return this[Math.floor(Math.random() * this.length)];
};

[1, 2, 3, 4].choose();

Upvotes: 0

Mitch Wheat
Mitch Wheat

Reputation: 300769

In C#

        Random random = new Random((int)DateTime.Now.Ticks);

        OrderedDictionary od = new OrderedDictionary();

        od.Add("abc", 1);
        od.Add("def", 2);
        od.Add("ghi", 3);
        od.Add("jkl", 4);


        int randomIndex = random.Next(od.Count);

        Console.WriteLine(od[randomIndex]);

        // Can access via index or key value:
        Console.WriteLine(od[1]);
        Console.WriteLine(od["def"]);

Upvotes: 0

Hugh Allen
Hugh Allen

Reputation: 6695

Icon has a set type and a random-element operator, unary "?", so the expression

? set( [1, 2, 3, 4, 5] )

will produce a random number between 1 and 5.

The random seed is initialized to 0 when a program is run, so to produce different results on each run use randomize()

Upvotes: 0

Related Questions