the_kaba

Reputation: 1567

An elegant way to specify the initial capacity of a Collector in the Java Stream API

I've been trying to find a good way to set the initial capacity of a collector in the Java Stream API. Here is the simplest example:

data.stream()
        .collect(Collectors.toList());

I just want to pass the size of the list as an int to the collector so that the internal array never has to be resized. My first thought was to do it like this:

data.stream()
        .collect(Collectors.toList(data.size()));

Unfortunately, toList has no overload that takes a parameter. I found one solution, but it smells:

data.stream()
        .collect(Collectors.toCollection(() -> new ArrayList<>(data.size())));

Is there a simpler way to express this?

Upvotes: 9

Views: 3943

Answers (3)

theKsoni

Reputation: 11

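// Collect into a HashSet created with an initial capacity of 100
data.stream().collect(Collectors.toCollection(() -> new HashSet<>(100)));
// Same, but wrap the result in an unmodifiable view
data.stream().collect(Collectors.collectingAndThen(
                        Collectors.toCollection(() -> new HashSet<>(100)), Collections::unmodifiableSet));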

Upvotes: 1

Mz A

Reputation: 1089

I'd take your inelegant

Collectors.toCollection(() -> new ArrayList<>(data.size()))

and wrap it in a static method

public static <T> Collector<T, ?, List<T>> toList(int size) {
    return Collectors.toCollection(() -> new ArrayList<T>(size));
}

then call it (with a static import)

stream.collect(toList(size))

Not so inelegant anymore?

Edit: this does commit the result to being an ArrayList. Is that bad?

Upvotes: 4

Lachezar Balev

Reputation: 12041

I do not know of any straightforward way in the API to set the capacity of the mutable container used under the hood to collect the data. My guess is that at least one of the reasons is support for parallelism via parallelStream().

So if your data is processed in parallel, there is not much sense in giving an initial capacity, even if you know that the underlying container (e.g. ArrayList) supports one. Multiple containers will be created by different threads and later combined, so the capacity will, if anything, harm overall performance.
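
To make the point about multiple containers concrete, here is a rough sketch that counts how often the collector's supplier runs. The class SupplierCallDemo and the method countContainers are names made up for this illustration, not part of the JDK. A sequential stream should ask for exactly one container, while a parallel one typically asks for several, each allocated with the full capacity n. The exact parallel count depends on the machine and on how the stream is split.

```java
import java.util.ArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class SupplierCallDemo {
    private SupplierCallDemo() {}

    // Hypothetical helper: returns how many containers the supplier created
    // while collecting the numbers 0..n-1.
    static int countContainers(boolean parallel, int n) {
        AtomicInteger created = new AtomicInteger();
        IntStream range = IntStream.range(0, n);
        if (parallel) {
            range = range.parallel();
        }
        range.boxed().collect(Collectors.toCollection(() -> {
            created.incrementAndGet();        // one call per container
            return new ArrayList<Integer>(n); // each one reserves room for n elements
        }));
        return created.get();
    }
}
```

Calling countContainers(false, 10_000) and countContainers(true, 10_000) side by side shows the difference: every extra container in the parallel run reserves space for all n elements even though it only receives a fraction of them.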

If you want to be truly specific and elegant, you may also try implementing your own collector. It is not difficult.
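
As a minimal sketch of what such a collector could look like, the following uses Collector.of with a supplier, an accumulator, and a combiner. The class SizedCollectors and the method toSizedList are hypothetical names chosen for this example. It assumes the expected size is known up front, and it simply merges partial lists if the stream happens to run in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;

final class SizedCollectors {
    private SizedCollectors() {}

    // Hypothetical helper: collects into an ArrayList pre-sized to expectedSize.
    static <T> Collector<T, ?, List<T>> toSizedList(int expectedSize) {
        return Collector.of(
                () -> new ArrayList<T>(expectedSize), // supplier: pre-sized container
                List::add,                            // accumulator: append one element
                (left, right) -> {                    // combiner: merge partial results
                    left.addAll(right);
                    return left;
                });
    }
}
```

It would be used like the built-in collectors, for example data.stream().collect(SizedCollectors.toSizedList(data.size())). For a sequential stream this behaves much like the toCollection variant from the question, so whether it is worth the extra class is a matter of taste.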

Upvotes: 1
