rmf

Reputation: 45

Convert List<char[]> into an Array char[] without using System.arraycopy()

What's a simple way to convert/flatten a List<char[]> to char[] in Java?

I know I can do it by iterating over the List and using System.arraycopy, but I'm wondering whether there is a simpler way to do it using Java 8 streams.

Maybe something like this, but without having to box the primitive char to Character:

List<char[]> listOfCharArrays = ...

Character[] charArray =
    Stream.of(listOfCharArrays )
        .flatMap(List::stream)
        .toArray(Character[]::new);

Upvotes: 14

Views: 1539

Answers (5)

Alexander Ivanchenko

Reputation: 28988

Here's a solution using the Stream API that doesn't entail additional memory allocation, which inevitably happens if you're using String and StringBuilder. Even with Java 8 it's not possible to instantiate a String without making an intermediate copy of the data, and StringBuilder will not give you access to its underlying array; it gives you a copy instead. Moreover, since Java 9 both String and StringBuilder are backed by byte[] arrays rather than arrays of characters.

Firstly, it makes sense to calculate the size of the resulting array (as has been already mentioned by @Maarten Bodewes and @Michael in their answers), which is a pretty fast operation because we are not processing the data of these arrays but only requesting the length of each of them.

And then, to construct the resulting array, we can use a collector that accumulates stream elements into an underlying char[] array and hands it out once all the stream elements have been processed, without any intermediate transformations or additional memory allocation.

All functions of the collector need to be stateless, and changes should happen only inside its mutable container. Hence, we need a mutable container that wraps a char[] array but, unlike StringBuilder, is not strongly encapsulated, i.e. it allows access to its underlying array. We can achieve that with CharBuffer.

So below is basically the same idea that was introduced in the answer by @Maarten Bodewes, fully implemented with streams.

CharBuffer.allocate(length) under the hood will instantiate char[] of the given length, and CharBuffer.array() will return the same array without generating an additional copy.

public static void main(String[] args) {
    
    List<char[]> listOfCharArrays =
        List.of(new char[]{'a', 'b', 'c'},
                new char[]{'d', 'e', 'f'},
                new char[]{'g', 'h', 'i'});

    char[] charArray = listOfCharArrays.stream()
        .collect(Collectors.collectingAndThen(
            Collectors.summingInt(arr -> arr.length),  // calculating the total length of the arrays in the list
            length -> listOfCharArrays.stream().collect(
                Collector.of(
                    () -> CharBuffer.allocate(length), // mutable container of the collector
                    CharBuffer::put,                   // accumulating stream elements inside the container
                    (left, right) -> left.put(right.flip()), // merging two containers with partial results (runs only when the stream is executed in parallel); flip() makes right's accumulated chars readable
                    CharBuffer::array                  // finisher function performs the final transformation
                ))
        ));

    System.out.println(Arrays.toString(charArray));
}

Output:

[a, b, c, d, e, f, g, h, i]
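
The claim that CharBuffer.array() hands back the backing array rather than a copy can be checked with a small sketch. The class and method names below are illustrative only and are not part of the solution above.

```java
import java.nio.CharBuffer;

class BackingArrayDemo {

    // Returns true if two calls to array() yield the very same char[] instance
    // and a write made through the buffer is visible through that array.
    static boolean sharesBackingArray() {
        CharBuffer buffer = CharBuffer.allocate(3);
        char[] first = buffer.array();
        buffer.put('x');                 // relative put: writes at position 0
        char[] second = buffer.array();
        return first == second && first[0] == 'x';
    }

    public static void main(String[] args) {
        System.out.println(sharesBackingArray());
    }
}
```

Because the array is shared, later writes to the buffer also change the array that was handed out, so the buffer should not be reused after calling array().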

Upvotes: 6

Maarten Bodewes

Reputation: 93978

I can think of only one thing, and that is to use CharBuffer. For efficiency reasons I would always first calculate the right size, and then perform the copy. Any solution that performs multiple copies and/or performs string handling will be inefficient.

Here's the code. The first line calculates the total size of the array required, and then allocates just enough memory for it. The second line performs the copying using the aforementioned put method. The final line returns the char[] that is backing the CharBuffer.

CharBuffer fullBuffer = CharBuffer.allocate(
        listOfCharArrays.stream().mapToInt(array -> array.length).sum());
listOfCharArrays.forEach(fullBuffer::put);
char[] asCharArray = fullBuffer.array();

Of course, I cannot guarantee that it won't use System.arraycopy somewhere inside the CharBuffer#put method. I would strongly expect that it uses System.arraycopy or similar code internally. That probably goes for most solutions provided here, though.

It is possible to avoid the first size calculation by using a large enough buffer if you can estimate a maximum size, but it would require an additional copy of the data in the buffer; CharBuffer#array simply returns the correctly sized backing array, which means that the data is copied only once.
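
As a rough sketch of that trade-off, the variant below skips the size calculation and trims the result afterwards. The maxSize parameter and the class and method names are assumptions made for illustration.

```java
import java.nio.CharBuffer;
import java.util.Arrays;
import java.util.List;

class OversizedBufferDemo {

    // maxSize is a caller-supplied upper bound (hypothetical parameter);
    // put throws BufferOverflowException if the estimate turns out too small.
    static char[] flattenWithEstimate(List<char[]> arrays, int maxSize) {
        CharBuffer buffer = CharBuffer.allocate(maxSize);
        arrays.forEach(buffer::put);
        // position() equals the number of chars written so far;
        // Arrays.copyOf is the additional copy mentioned above.
        return Arrays.copyOf(buffer.array(), buffer.position());
    }

    public static void main(String[] args) {
        List<char[]> input = Arrays.asList(new char[]{'a', 'b'}, new char[]{'c'});
        System.out.println(Arrays.toString(flattenWithEstimate(input, 16)));
    }
}
```

The data is copied twice here (into the buffer, then into the trimmed array), which is why calculating the exact size first is usually preferable.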


You can also use CharBuffer directly if you want to use object-oriented code. Beware, though, that you need to flip it after writing to it, and that CharBuffer is mutable (you can pass copies using the duplicate or asReadOnlyBuffer methods; the returned instances reference the same buffer but have independent, mutable "position" and "limit" fields).

The Buffer and Java NIO classes are slightly tricky to understand, but once you do, you get great benefits from them, e.g. when using them with CharsetEncoder or memory-mapped files.
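
A minimal sketch of the flip step and of an independent read-only view, assuming the buffer is filled the same way as above (class and method names are illustrative):

```java
import java.nio.CharBuffer;

class FlipDemo {

    // Fills a buffer, flips it for reading, and returns its readable content.
    static String readBack(char[]... parts) {
        int total = 0;
        for (char[] part : parts) {
            total += part.length;
        }
        CharBuffer buffer = CharBuffer.allocate(total);
        for (char[] part : parts) {
            buffer.put(part);
        }
        buffer.flip(); // limit = chars written, position = 0: ready for reading

        // The view shares the content but has its own position and limit.
        CharBuffer view = buffer.asReadOnlyBuffer();
        if (view.hasRemaining()) {
            view.get(); // advances only the view's position
        }
        // toString() returns the chars between position and limit of 'buffer',
        // which the read on 'view' has not affected.
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(readBack(new char[]{'a', 'b'}, new char[]{'c'}));
    }
}
```

Without the flip, the position would still sit at the end of the written data and a reader would see nothing.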

Upvotes: 16

Joop Eggen

Reputation: 109557

It can be done via String or rather a CharBuffer as Holger commented.

char[] flatten(List<char[]> list) {
    return list.stream()
        .map(CharBuffer::wrap) // Better than String::new
        .collect(Collectors.joining())
        .toCharArray();
}

This requires each array to be "complete", i.e. without an incomplete surrogate pair of chars at its beginning or end.

So compare this with:

char[] flatten(List<char[]> list) {
    int totalLength = list.stream().mapToInt(a -> a.length).sum();
    char[] totalArray = new char[totalLength];
    int i = 0;
    for (char[] array : list) {
        System.arraycopy(array, 0, totalArray, i, array.length);
        i += array.length; 
    }
    return totalArray;
}

Not so big a difference, and more solid code.

Or base the entire software on the immutable String instead of char[].

Upvotes: 7

Michael

Reputation: 44150

This is the most readable version I can come up with. You can append all the char arrays to a String, via a StringBuilder, then convert that to a char[].

char[] chars = listOfCharArrays.stream()
    .collect(Collector.of(StringBuilder::new, StringBuilder::append, StringBuilder::append, StringBuilder::toString))
    .toCharArray();

Probably much slower than the iterative version, since System.arraycopy can copy blocks of memory.

You could consider precomputing the total number of chars to avoid StringBuilder array reallocations, but this optimization and any others are going to eat into the readability gains you're getting from using streams.

int totalSize = listOfCharArrays.stream().mapToInt(arr -> arr.length).sum();
char[] chars = listOfCharArrays.stream()
    .collect(Collector.of(() -> new StringBuilder(totalSize), //... the same
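
Written out in full, the presized variant might look like the sketch below. It assumes the elided arguments are the same as in the first snippet, as the comment says; the wrapper class and method name are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class PresizedBuilderDemo {

    static char[] flatten(List<char[]> listOfCharArrays) {
        // Precompute the capacity so the StringBuilder never has to grow.
        int totalSize = listOfCharArrays.stream().mapToInt(arr -> arr.length).sum();
        String joined = listOfCharArrays.stream()
            .collect(Collector.of(
                () -> new StringBuilder(totalSize), // presized container
                StringBuilder::append,              // append(char[])
                StringBuilder::append,              // merge partial results
                StringBuilder::toString));          // first extra copy
        return joined.toCharArray();                // second extra copy
    }

    public static void main(String[] args) {
        List<char[]> input = Arrays.asList(new char[]{'a', 'b'}, new char[]{'c'});
        System.out.println(Arrays.toString(flatten(input)));
    }
}
```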

There are 2 unnecessary copies (StringBuilder -> String, String -> char[]) which are effectively a consequence of these classes not being perfectly suited to this task. CharBuffer is better suited; see Maarten's answer.

Upvotes: 18

Youcef LAIDANI

Reputation: 59988

Maybe not the best solution, but you can use:

char[] chars = listOfCharArrays.stream()
        .map(String::valueOf)
        .collect(Collectors.joining())
        .toCharArray();

Upvotes: 4
