aekber
aekber

Reputation: 531

How to get a random element from a list with stream api?

What is the most effective way to get a random element from a list with Java8 stream api?

Arrays.asList(new Obj1(), new Obj2(), new Obj3());

Thanks.

Upvotes: 34

Views: 43229

Answers (10)

OscarRyz
OscarRyz

Reputation: 199284

Use: Collectors.collectingAndThen

Most of the answers here assume you know the list size ahead of time. If you don't and you only know until you're processing the list with a stream, you can use Collectors.collectingAndThen

 Random random = new Random();
 Arrays.asList(1,2,3)
   .stream()
   .collect(
       Collectors.collectingAndThen(
           Collectors.toList(), 
           l -> l.get(random.nextInt(l.size()))
       )
   );

Upvotes: 0

Atakan Yıldırım
Atakan Yıldırım

Reputation: 902

Sometimes you may want to get a random item somewhere in the stream. If you want to get random items even after filtering your list, this code snippet will work for you:

List<String> items = Arrays.asList("A", "B", "C", "D", "E");
List<String> shuffledAndFilteredItems = items.stream()
        .filter(value -> value.equals("A") || value.equals("B"))
        //filter, map...
        .collect(Collectors.collectingAndThen(
                Collectors.toCollection(ArrayList::new),
                list -> {
                    Collections.shuffle(list);
                    return list;
                }));
String randomItem = shuffledAndFilteredItems
        .stream()
        .findFirst()
        .orElse(null);

Of course there may be faster / optimized ways, but it allows you to do it all at once.

Upvotes: 1

Jean-Baptiste Yun&#232;s
Jean-Baptiste Yun&#232;s

Reputation: 36431

Why with streams? You just have to get a random number from 0 to the size of the list and then call get on this index:

Random r = new Random();
ElementType e = list.get(r.nextInt(list.size()));

Stream will give you nothing interesting here, but you can try with:

Random r = new Random();
ElementType e = list.stream().skip(r.nextInt(list.size())).findFirst().get();

Idea is to skip an arbitrary number of elements (but not the last one!), then get the first element if it exists. As a result you will have an Optional<ElementType> which will be non empty and then extract its value with get. You have a lot of options here after having skip.

Using streams here is highly inefficient...

Note: that none of these solutions take in account empty lists, but the problem is defined on non-empty lists.

Upvotes: 44

adrianosymphony
adrianosymphony

Reputation: 151

In the last time I needed to do something like that I did that:

List<String> list = Arrays.asList("a", "b", "c");
Collections.shuffle(list);
String letter = list.stream().findAny().orElse(null);
System.out.println(letter);

Upvotes: 5

Chris A
Chris A

Reputation: 71

If you don't know in advance the size of the your list, you could do something like that :

 yourStream.collect(new RandomListCollector<>(randomSetSize));

I guess that you will have to write your own Collector implementation like this one to have an homogeneous randomization :

 public class RandomListCollector<T> implements Collector<T, RandomListCollector.ListAccumulator<T>, List<T>> {

private final Random rand;
private final int size;

public RandomListCollector(Random random , int size) {
    super();
    this.rand = random;
    this.size = size;
}

public RandomListCollector(int size) {
    this(new Random(System.nanoTime()), size);
}

@Override
public Supplier<ListAccumulator<T>> supplier() {
    return () -> new ListAccumulator<T>();
}

@Override
public BiConsumer<ListAccumulator<T>, T> accumulator() {
    return (l, t) -> {
        if (l.size() < size) {
            l.add(t);
        } else if (rand.nextDouble() <= ((double) size) / (l.gSize() + 1)) {
            l.add(t);
            l.remove(rand.nextInt(size));
        } else {
            // in any case gSize needs to be incremented
            l.gSizeInc();
        }
    };

}

@Override
public BinaryOperator<ListAccumulator<T>> combiner() {
    return (l1, l2) -> {
        int lgSize = l1.gSize() + l2.gSize();
        ListAccumulator<T> l = new ListAccumulator<>();
        if (l1.size() + l2.size()<size) {
            l.addAll(l1);
            l.addAll(l2);
        } else {
            while (l.size() < size) {
                if (l1.size()==0 || l2.size()>0 && rand.nextDouble() < (double) l2.gSize() / (l1.gSize() + l2.gSize())) {
                    l.add(l2.remove(rand.nextInt(l2.size()), true));
                } else {
                    l.add(l1.remove(rand.nextInt(l1.size()), true));
                }
            }
        }
        // set the gSize of l :
        l.gSize(lgSize);
        return l;

    };
}

@Override
public Function<ListAccumulator<T>, List<T>> finisher() {

    return (la) -> la.list;
}

@Override
public Set<Characteristics> characteristics() {
    return Collections.singleton(Characteristics.CONCURRENT);
}

static class ListAccumulator<T> implements Iterable<T> {
    List<T> list;
    volatile int gSize;

    public ListAccumulator() {
        list = new ArrayList<>();
        gSize = 0;
    }

    public void addAll(ListAccumulator<T> l) {
        list.addAll(l.list);
        gSize += l.gSize;

    }

    public T remove(int index) {
        return remove(index, false);
    }

    public T remove(int index, boolean global) {
        T t = list.remove(index);
        if (t != null && global)
            gSize--;
        return t;
    }

    public void add(T t) {
        list.add(t);
        gSize++;

    }

    public int gSize() {
        return gSize;
    }

    public void gSize(int gSize) {
        this.gSize = gSize;

    }

    public void gSizeInc() {
        gSize++;
    }

    public int size() {
        return list.size();
    }

    @Override
    public Iterator<T> iterator() {
        return list.iterator();
    }
}

}

If you want something easier and still don't want to load all your list in memory:

public <T> Stream<T> getRandomStreamSubset(Stream<T> stream, int subsetSize) {
    int cnt = 0;

    Random r = new Random(System.nanoTime());
    Object[] tArr = new Object[subsetSize];
    Iterator<T> iter = stream.iterator();
    while (iter.hasNext() && cnt <subsetSize) {
        tArr[cnt++] = iter.next();          
    }

    while (iter.hasNext()) {
        cnt++;
        T t = iter.next();
        if (r.nextDouble() <= (double) subsetSize / cnt) {
            tArr[r.nextInt(subsetSize)] = t;                

        }

    }

    return Arrays.stream(tArr).map(o -> (T)o );
}

but you are then away from the stream api and could do the same with a basic iterator

Upvotes: 1

Grzegorz Piwowarek
Grzegorz Piwowarek

Reputation: 13823

Another idea would be to implement your own Spliterator and then use it as a source for Stream:

import java.util.List;
import java.util.Random;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class ImprovedRandomSpliterator<T> implements Spliterator<T> {

    private final Random random;
    private final T[] source;
    private int size;

    ImprovedRandomSpliterator(List<T> source, Supplier<? extends Random> random) {
        if (source.isEmpty()) {
            throw new IllegalArgumentException("RandomSpliterator can't be initialized with an empty collection");
        }
        this.source = (T[]) source.toArray();
        this.random = random.get();
        this.size = this.source.length;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        if (size > 0) {
            int nextIdx = random.nextInt(size); 
            int lastIdx = size - 1; 

            action.accept(source[nextIdx]); 
            source[nextIdx] = source[lastIdx]; 
            source[lastIdx] = null; // let object be GCed 
            size--;
            return true;
        } else {
            return false;
        }
    }

    @Override
    public Spliterator<T> trySplit() {
        return null;
    }

    @Override
    public long estimateSize() {
        return source.length;
    }

    @Override
    public int characteristics() {
        return SIZED;
    }
}


public static <T> Collector<T, ?, Stream<T>> toShuffledStream() {
    return Collectors.collectingAndThen(
      toCollection(ArrayList::new),
      list -> !list.isEmpty()
        ? StreamSupport.stream(new ImprovedRandomSpliterator<>(list, Random::new), false)
        : Stream.empty());
}

and then simply:

list.stream()
  .collect(toShuffledStream())
  .findAny();

Details can be found here.

...but it's definitely an overkill, so if you're looking for a pragmatic approach. Definitely go for Jean's solution.

Upvotes: 1

Mikusch
Mikusch

Reputation: 135

While all the given answers work, there is a simple one-liner that does the trick without having to check if the list is empty first:

List<String> list = List.of("a", "b", "c");
list.stream().skip((int) (list.size() * Math.random())).findAny();

For an empty list this will return an Optional.empty.

Upvotes: 8

Tarandrus
Tarandrus

Reputation: 1

The selected answer has errors in its stream solution... You cannot use Random#nextInt with a non-positive long, "0" in this case. The stream solution will also never choose the last in the list Example:

List<Integer> intList = Arrays.asList(0, 1, 2, 3, 4);

// #nextInt is exclusive, so here it means a returned value of 0-3
// if you have a list of size = 1, #next Int will throw an IllegalArgumentException (bound must be positive)
int skipIndex = new Random().nextInt(intList.size()-1);

// randomInt will only ever be 0, 1, 2, or 3. Never 4
int randomInt = intList.stream()
                       .skip(skipIndex) // max skip of list#size - 2
                       .findFirst()
                       .get();

My recommendation would be to go with the non-stream approach that Jean-Baptiste Yunès put forth, but if you must do a stream approach, you could do something like this (but it's a little ugly):

list.stream()
    .skip(list.isEmpty ? 0 : new Random().nextInt(list.size()))
    .findFirst();

Upvotes: 0

RichardK
RichardK

Reputation: 3471

There are much more efficient ways to do it, but if this has to be Stream the easiest way is to create your own Comparator, which returns random result (-1, 0, 1) and sort your stream:

 List<String> strings = Arrays.asList("a", "b", "c", "d", "e", "f");
    String randomString = strings
            .stream()
            .sorted((o1, o2) -> ThreadLocalRandom.current().nextInt(-1, 2))
            .findAny()
            .get();

ThreadLocalRandom has ready "out of the box" method to get random number in your required range for comparator.

Upvotes: 7

KidCrippler
KidCrippler

Reputation: 1723

If you HAVE to use streams, I wrote an elegant, albeit very inefficient collector that does the job:

/**
 * Returns a random item from the stream (or null in case of an empty stream).
 * This operation can't be lazy and is inefficient, and therefore shouldn't
 * be used on streams with a large number or items or in performance critical sections.
 * @return a random item from the stream or null if the stream is empty.
 */
public static <T> Collector<T, List<T>, T> randomItem() {
    final Random RANDOM = new Random();
    return Collector.of(() -> (List<T>) new ArrayList<T>(), 
                              (acc, elem) -> acc.add(elem),
                              (list1, list2) -> ListUtils.union(list1, list2), // Using a 3rd party for list union, could be done "purely"
                              list -> list.isEmpty() ? null : list.get(RANDOM.nextInt(list.size())));
}

Usage:

@Test
public void standardRandomTest() {
    assertThat(Stream.of(1, 2, 3, 4).collect(randomItem())).isBetween(1, 4);
}

Upvotes: 2

Related Questions