gstackoverflow

Reputation: 36984

How to improve performance of String processing using Stream#reduce?

In my legacy project, this code takes more than one minute to run for 10000 elements:

private ByteArrayInputStream getInputStreamFromContactFile(MyDTO contacts) {
        long start = System.currentTimeMillis();
        try {
            byte[] bytes = contacts.getLines()
                    .stream()
                    .map(lineItem -> lineItem.value)
                    .reduce(contacts.getHeader().concat("\n"), (partialString, el) -> partialString + el+ '\n')
                    .getBytes();
            return new ByteArrayInputStream(bytes);
        } finally {
            log.info("Duration is {}ms", System.currentTimeMillis() - start);
        }
}

Is there any obvious way to make it faster?

Upvotes: 0

Views: 103

Answers (3)

Nowhere Man

Reputation: 19545

To improve performance, it may be better to concatenate the values through an intermediate ByteArrayOutputStream wrapped in an OutputStreamWriter.

The byte array of the concatenated result is then returned by ByteArrayOutputStream::toByteArray:

private ByteArrayInputStream getInputStreamFromContactFile(MyDTO contacts) throws IOException {
    long start = System.currentTimeMillis();
    try {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(bos);
        writer.write(contacts.getHeader());
        writer.write("\n");

        contacts.getLines().forEach(line -> { 
            try {
                writer.write(line.value);
                writer.write("\n");
            } catch (IOException ioex) { throw new RuntimeException(ioex);}
        });
        writer.flush();

        return new ByteArrayInputStream(bos.toByteArray());
    } finally {
        log.info("Duration is {}ms", System.currentTimeMillis() - start);
    }
}

Another approach could be to use Collectors.joining with prefix and suffix:

private ByteArrayInputStream getInputStreamFromContactFile(MyDTO contacts) {
    long start = System.currentTimeMillis();
    try {
        return new ByteArrayInputStream(
            contacts.getLines()
                .stream()
                .map(item -> item.value)
                .collect(Collectors.joining("\n", contacts.getHeader().concat("\n"), "\n"))
                .getBytes()
        );
    } finally {
        log.info("Duration is {}ms", System.currentTimeMillis() - start);
    }
}
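
To make the prefix and suffix arguments easier to follow, here is a minimal, self-contained sketch of the same Collectors.joining call. The JoiningDemo class, the LineItem stand-in and the explicit UTF-8 charset are assumptions for illustration only; just the `value` field comes from the question.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JoiningDemo {
    // Hypothetical stand-in for the question's line items; only `value` is taken from the source.
    static class LineItem {
        final String value;
        LineItem(String value) { this.value = value; }
    }

    // Header line first (prefix), values separated by "\n", one trailing "\n" (suffix).
    static String join(String header, List<LineItem> lines) {
        return lines.stream()
                .map(item -> item.value)
                .collect(Collectors.joining("\n", header + "\n", "\n"));
    }

    public static void main(String[] args) {
        List<LineItem> lines = Arrays.asList(new LineItem("a"), new LineItem("b"));
        String text = join("HEADER", lines);
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8); // explicit charset is an assumption
        System.out.println(text.equals("HEADER\na\nb\n")); // prints true
        System.out.println(bytes.length);                   // prints 11
    }
}
```

One difference worth knowing: for an empty list, joining still emits prefix and suffix, so the result is "HEADER\n\n", whereas the original reduce version would give "HEADER\n". If empty input is possible, that case may need separate handling.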

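If Stream::reduce really has to be used for some reason, it can be combined with a StringBuilder as shown below. Note that this mutates the identity value, which goes against the contract of reduce, so it should only be used on a sequential stream: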

private static ByteArrayInputStream getInputStreamFromContactFileReducing(MyDTO contacts) {

    long start = System.currentTimeMillis();
    try {
        byte[] bytes = contacts.getLines()
                               .stream()
                               .map(lineItem -> lineItem.value)
                               .reduce(new StringBuilder().append(contacts.getHeader()).append("\n"),
                                       (sb, line) -> sb.append(line).append('\n'),
                                       (sb1, sb2) -> sb1.append(sb2))
                               .toString()
                               .getBytes();
        return new ByteArrayInputStream(bytes);
    } finally {
        log.info("Reducing: Duration is {}ms", System.currentTimeMillis() - start);
    }
}
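
As a side note, Stream::collect with a supplier is the mutable-reduction counterpart of reduce, and it avoids sharing one StringBuilder as the identity. The sketch below is not part of the original code; the CollectDemo class and its `build` helper are hypothetical names, and the values are assumed to be already mapped to String.

```java
import java.util.Arrays;
import java.util.List;

public class CollectDemo {
    // Each (possibly parallel) chunk gets its own StringBuilder from the supplier;
    // the combiner appends the partial builders in encounter order.
    static String build(String header, List<String> values, boolean parallel) {
        StringBuilder body = (parallel ? values.parallelStream() : values.stream())
                .collect(StringBuilder::new,
                         (sb, line) -> sb.append(line).append('\n'),
                         (left, right) -> left.append(right));
        return header + "\n" + body;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "c");
        System.out.println(build("HEADER", values, false).equals("HEADER\na\nb\nc\n")); // prints true
        System.out.println(build("HEADER", values, true).equals("HEADER\na\nb\nc\n"));  // prints true
    }
}
```

Because the supplier creates a fresh container per chunk, this variant stays correct even if the stream is later switched to parallel.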

Upvotes: 1

Leonard Brünings

Reputation: 13222

If speed is really important, using a StringBuilder will help, but it looks a bit less functional:

StringBuilder builder = new StringBuilder();
builder.append(contacts.getHeader());
builder.append("\n");
contacts.getLines()
    .stream()
    .map(lineItem -> lineItem.value)
    .forEach(line -> {
      builder.append(line);
      builder.append("\n");
    });
byte[] bytes = builder.toString().getBytes();

Upvotes: 1

WJS

Reputation: 40034

Well, for over 10_000_000 lines of 36 characters each, this ran in under 4 seconds. Not certain if it does what you want though.

private ByteArrayInputStream getInputStreamFromContactFile(MyDTO contacts) {
    long start = System.currentTimeMillis();
    try {
        StringBuilder sb = new StringBuilder(contacts.getHeader()).append("\n");
        for (String lineItem : contacts.getLines()) {
            sb.append(lineItem).append("\n");
        }
        return new ByteArrayInputStream(sb.toString().getBytes());
    } finally {
        log.info("Duration is {}ms", System.currentTimeMillis() - start);
    }
}
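
To check whether the loop really does the same thing as the original reduce, a small comparison like the one below may help. It is only a sketch: the EquivalenceCheck class and its method names are made up for illustration, the values are assumed to be plain Strings (in the question they come from `lineItem.value`), and the explicit UTF-8 charset is an assumption as well. The likely reason the original is slow is that `partialString + el + '\n'` copies the whole accumulated string on every step, so the total work grows quadratically with the output size.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EquivalenceCheck {
    // The approach from the question: every step builds a brand-new String.
    static byte[] withReduce(String header, List<String> values) {
        return values.stream()
                .reduce(header.concat("\n"), (partial, el) -> partial + el + '\n')
                .getBytes(StandardCharsets.UTF_8);
    }

    // The loop approach: one StringBuilder that grows in place.
    static byte[] withBuilder(String header, List<String> values) {
        StringBuilder sb = new StringBuilder(header).append('\n');
        for (String value : values) {
            sb.append(value).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<String> values = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            values.add("line-" + i);
        }
        // Both variants should produce byte-for-byte identical output.
        System.out.println(Arrays.equals(withReduce("HEADER", values),
                                         withBuilder("HEADER", values))); // prints true
    }
}
```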

Upvotes: 2
