Basil Bourque
Basil Bourque

Reputation: 339837

No result when filtering on a random IntStream, in Java

I am representing a collection of turtles, each with a moment when hatched. I want to randomly choose any mature turtle.

I pretend that minutes are years, in this aquarium simulation.

record Turtle(
        String name ,
        Instant hatched ,
        Duration lifespan
)
{
    static final Duration MATURITY = Duration.ofMinutes ( 4 );

    Duration age ( ) { return Duration.between ( this.hatched , Instant.now ( ) ); }
}

I use ThreadLocalRandom to generate an IntStream of random integers. I use those integers as an index into my List of Turtle objects. I check the age of that turtle, to see if it exceeds the predefined MATURITY duration. If not mature, let the stream continue.

Of course, no turtles may yet be mature. So I defined the return type as Optional < Turtle >, to be empty if no turtle found.

The problem is that my stream seems to fail, with no result ever returned. Why?

public class Aquarium
{
    public static void main ( String[] args )
    {
        System.out.println ( "INFO Demo start. " + Instant.now ( ) );

        // Sample data.
        List < Turtle > turtles =
                List.of (
                        new Turtle ( "Alice" , Instant.now ( ).truncatedTo ( ChronoUnit.MINUTES ).minus ( Duration.ofMinutes ( 3 ) ) , Duration.ofMinutes ( 17 ) ) ,
                        new Turtle ( "Bob" , Instant.now ( ).truncatedTo ( ChronoUnit.MINUTES ).minus ( Duration.ofMinutes ( 2 ) ) , Duration.ofMinutes ( 16 ) ) ,
                        new Turtle ( "Carol" , Instant.now ( ).truncatedTo ( ChronoUnit.MINUTES ).minus ( Duration.ofMinutes ( 1 ) ) , Duration.ofMinutes ( 18 ) ) ,
                        new Turtle ( "Davis" , Instant.now ( ).truncatedTo ( ChronoUnit.MINUTES ).minus ( Duration.ofMinutes ( 2 ) ) , Duration.ofMinutes ( 22 ) )
                );
        System.out.println ( "turtles = " + turtles );

        // Logic
        Optional < Turtle > anArbitraryMatureTurtle =
                ThreadLocalRandom
                        .current ( )
                        .ints ( 0 , turtles.size ( ) )
                        .filter (
                                ( int randomIndex ) -> turtles.get ( randomIndex ).age ( ).compareTo ( Turtle.MATURITY ) > 0
                        )
                        .mapToObj ( turtles :: get )
                        .findAny ( );
        System.out.println ( "anArbitraryMatureTurtle = " + anArbitraryMatureTurtle );

        // Wrap-up
        try { Thread.sleep ( Duration.ofMinutes ( 30 ) ); } catch ( InterruptedException e ) { throw new RuntimeException ( e ); } // Let the aquarium run a while.
        System.out.println ( "INFO Demo end. " + Instant.now ( ) );
    }
}

Upvotes: 2

Views: 148

Answers (3)

WJS
WJS

Reputation: 40047

"Shuffling an entire collection is likely not as efficient as randomly-picking an index."

Perhaps I missed something but I disagree. Consider that when you generate a list of size n distinct random numbers, the further it progresses toward the limit, the more misses it will encounter to get those last few distinct values. And the larger the value of n, the worse it gets.

But shuffling a list of n digits should always be O(n)

Below is my shuffle routine. It shuffles in place and returns the iterations (which should be obvious but I returned the value anyway). It uses a modified Fisher-Yates algorithm. I am presuming the Collections API version is as good if not better.

Shuffling plus initial array generation always takes 2*n iterations.

    public static int shuffle(int[] array) {
        Random r = new Random();
        int iterations = 0;
        for (int i = array.length; i > 0; i--) {
            int pos = r.nextInt(i);
            int temp = array[i-1];
            array[i-1] = array[pos];
            array[pos] = temp;
            iterations++;
        }
        return iterations;
    }

The following further modifies the shuffle to randomize the values from 0 to turtles.size, placing each on the stream as it is encountered.

IntStream.of(turtles.size()).mapMulti(Aquarium::randomize).
    // rest of stream here
    
  
public static void randomize(int arraySize, IntConsumer con) {
    ThreadLocalRandom r = ThreadLocalRandom.current();
    int[] array = IntStream.range(0, arraySize).toArray();
    while (arraySize > 0) {
        int pos = r.nextInt(arraySize);
        int temp = array[--arraySize];
        con.accept(array[pos]);
        array[pos] = temp;
    }
}

Upvotes: 0

Basil Bourque
Basil Bourque

Reputation: 339837

Infinite Stream can be infinite loop

You inadvertently made an infinite loop of your infinite IntStream.

Your IntStream created by ThreadLocalRandom.ints is infinite, continually generating one random number after another, within your defined range. Duplicates may occur, as the random number keeps on generating until your stream reaches a terminal operation.

You called findAny to terminate your stream. But look at your sample data. All of your turtles are young, where none of them have an age meeting your IntPredicate test of exceeding the MATURITY duration. So none of your turtles ever pass the age test. So the findAny never finds any Turtle, so it never returns any Optional < Turtle >.

This means the random number generating keeps grinding away, producing one index number after another. For each random number, we access the List of Turtle objects. But none pass our predicate test. Therefore, our stream never terminates. This random-generator -> List#get -> predicate test routine continues endlessly. The stream never terminates, in this endless loop, giving the appearance of nothing happening. In actuality, a lot is happening, with a very busy CPU core generating numbers, accessing the list, and calculating & comparing durations, endlessly.

We can demonstrate this by changing your sample data. Start the Bob & Davis fishes with hatch-dates of 7 minutes ago rather than 2 minutes. Run your code again to see that either of them is randomly picked, and picked quite quickly.

        List < Turtle > turtles =
                List.of (
                        new Turtle ( "Alice" , Instant.now ( ).truncatedTo ( ChronoUnit.MINUTES ).minus ( Duration.ofMinutes ( 3 ) ) , Duration.ofMinutes ( 17 ) ) ,
                        new Turtle ( "Bob" , Instant.now ( ).truncatedTo ( ChronoUnit.MINUTES ).minus ( Duration.ofMinutes ( 7 ) ) , Duration.ofMinutes ( 16 ) ) ,
                        new Turtle ( "Carol" , Instant.now ( ).truncatedTo ( ChronoUnit.MINUTES ).minus ( Duration.ofMinutes ( 1 ) ) , Duration.ofMinutes ( 18 ) ) ,
                        new Turtle ( "Davis" , Instant.now ( ).truncatedTo ( ChronoUnit.MINUTES ).minus ( Duration.ofMinutes ( 7 ) ) , Duration.ofMinutes ( 22 ) )
                );

anArbitraryMatureTurtle = Optional[Turtle[name=Davis, hatched=2024-12-23T21:17:00Z, lifespan=PT22M]]

Solution

There is no solution to directly fixing your current code. You cannot use an infinite stream without a guaranteed termination.

Instead, you must rewrite your code.

One rewrite could be making a copy of your List of Turtle objects. Shuffle them, putting the list in random order. Then stream that randomized list until one is found meeting the condition(s) of your predicate test. If none meet those conditions, the stream ends after exhaustively checking every element, and an empty Optional will be returned.

Change your Logic section to this:

        // Logic
        List < Turtle > turtlesShuffled = new ArrayList <> ( turtles );
        Collections.shuffle ( turtlesShuffled );
        Optional < Turtle > anArbitraryMatureTurtle =
                turtlesShuffled
                        .stream ( )
                        .filter (
                                turtle -> turtle.age ( ).compareTo ( Turtle.MATURITY ) > 0
                        )
                        .findAny ( );

No more random index.

When run with all young immature turtles:

anArbitraryMatureTurtle = Optional.empty

That same logic could be done with a conventional for loop rather than a Stream. I find the Stream syntax to be more expressive in this particular case.

        // Logic
        List < Turtle > turtlesShuffled = new ArrayList <> ( turtles );
        Collections.shuffle ( turtlesShuffled );
        Optional < Turtle > anArbitraryMatureTurtle = Optional.empty ( );
        for ( Turtle turtle : turtlesShuffled )
        {
            if ( turtle.age ( ).compareTo ( Turtle.MATURITY ) > 0 )
            {
                anArbitraryMatureTurtle = Optional.of ( turtle );
                break;
            }
        }

Shuffling an entire collection is likely not as efficient as randomly-picking an index. That may have been your original motivation in choosing the random-index approach. But given your possible data values, the random-index approach is not feasible.


Wrong… there is a solution fixing that stream of random index numbers approach. See correct Answer by tquadrat.

Upvotes: 1

tquadrat
tquadrat

Reputation: 4044

Try this for the Logic section in your main() method, adding calls to Stream#distinct and Stream#limit. This way we cover all possible index values once, in a random order.

…
// Logic
Optional<Turtle> anArbitraryMatureTurtle = ThreadLocalRandom.current()
  .ints( 0, turtles.size() )
  .distinct()
  .limit( turtles.size() )
  .filter( (int randomIndex) -> turtles.get( randomIndex ).age().compareTo( Turtle.MATURITY ) > 0 )
  .mapToObj( turtles::get )
  .findAny();
System.out.println( "anArbitraryMatureTurtle = " + anArbitraryMatureTurtle );
…

If no Turtle passes the test, we get an empty Optional.

The code sequence:

ThreadLocalRandom.current()
  .ints( 0, 5 )
  .distinct()
  .limit( 5 )
  .forEach( System.out::println );

… prints the five values 0 to 4 in a random order – always five, no duplicates. Of course you have to use turtles.size() for your use case instead of the constant 5.

I have some doubts that this is the most efficient way to get a random mature turtle from your list (given that there is one), but it works.

Upvotes: 1

Related Questions