quarks

Reputation: 35276

Fastest way to read from InputStream to OutputStream

The code below takes 1.3 seconds to stream a 2.43 MB file:

byte[] buff = new byte[64*1024];

private static void flow(InputStream is, OutputStream os, byte[] buf )
        throws IOException {
    int numRead;
    while ( (numRead = is.read(buf) ) >= 0) {
        os.write(buf, 0, numRead);
    }
}

What is the fastest way to "stream" an InputStream to OutputStream?

Update:

Data source is a cache, EHCache to be specific:

byte[] cached = cacheService.get(cacheKey); // Just `2 ms` to get the bytes, very fast
if(cached != null && cached.length > 0) {
    flow(ByteSource.wrap(cached).openStream(), outputStream, buff);
}

Upvotes: 0

Views: 2025

Answers (3)

NoDataFound

Reputation: 11959

I would also have suggested commons-io's IOUtils.copy, expecting it to do better than a naive loop, but its code appears to do the same as yours (see copyLarge below). The answer about Java 9's transferTo is therefore probably the better choice.

public static long copyLarge(final InputStream input, final OutputStream output, final byte[] buffer)
        throws IOException {
    long count = 0;
    int n;
    while (EOF != (n = input.read(buffer))) {
        output.write(buffer, 0, n);
        count += n;
    }
    return count;
}

However, your problem may not be how you copy but a lack of buffering: you could try wrapping the existing streams in BufferedInputStream and BufferedOutputStream:

  • Files.newInputStream is not buffered.
  • Files.newOutputStream is not buffered.
  • You could use FileChannel and ByteBuffer.
  • The operating system is probably buffering the file on its side.
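A minimal sketch of the buffering suggestion above, assuming both ends are files opened with Files.newInputStream and Files.newOutputStream. The class name BufferedCopy and the bufferSize parameter are hypothetical, chosen only for illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
public class BufferedCopy {

    // Copies src to dst through buffered wrappers and returns the number of bytes copied.
    public static long copy(Path src, Path dst, int bufferSize) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(src), bufferSize);
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(dst), bufferSize)) {
            byte[] buf = new byte[bufferSize];
            long total = 0;
            int n;
            while ((n = in.read(buf)) >= 0) {
                out.write(buf, 0, n);
                total += n;
            }
            return total; // closing the try block flushes the buffered output
        }
    }
}
```

When the copy loop already reads into an array at least as large as the wrapper's internal buffer, the buffered streams largely pass the calls straight through, so the gain may be small. Measuring is the only way to know.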

You should write a JMH benchmark:

  • Not sure how you can disable system buffering. I don't think it is a problem.
  • I would first check the results with a buffered input stream of various sizes (8K, 16K, 32K, 64K, 512K, 1M, 2M, 4M, 8M).
  • Then with buffered output stream
  • Then with a mix of two.

While it may take time to run, finding out what is fastest requires measuring.
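JMH needs an extra dependency, so the following is only a rough stand-in that uses System.nanoTime and plain Java. It does not handle JIT warm-up the way JMH does, so treat its numbers as indicative at best. It assumes the data is already in memory, as in the question's cache case; the class name BufferSizeTiming and the method time are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical timing harness; a crude substitute for a real JMH benchmark.
public class BufferSizeTiming {

    // Same loop as in the question, but returns the number of bytes copied.
    public static long flow(InputStream is, OutputStream os, byte[] buf) throws IOException {
        long total = 0;
        int numRead;
        while ((numRead = is.read(buf)) >= 0) {
            os.write(buf, 0, numRead);
            total += numRead;
        }
        return total;
    }

    // Runs `rounds` in-memory copies per buffer size and keeps the best time (nanoseconds) for each.
    public static Map<Integer, Long> time(byte[] data, int[] bufferSizes, int rounds) throws IOException {
        Map<Integer, Long> best = new LinkedHashMap<>();
        for (int size : bufferSizes) {
            byte[] buf = new byte[size];
            long bestNanos = Long.MAX_VALUE;
            for (int i = 0; i < rounds; i++) {
                ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
                long start = System.nanoTime();
                long copied = flow(new ByteArrayInputStream(data), out, buf);
                long elapsed = System.nanoTime() - start;
                if (copied != data.length) {
                    throw new IllegalStateException("short copy: " + copied);
                }
                bestNanos = Math.min(bestNanos, elapsed);
            }
            best.put(size, bestNanos);
        }
        return best;
    }
}
```

Calling time(data, new int[] {8 << 10, 64 << 10, 1 << 20}, 20) and printing the returned map gives one best-of-20 figure per buffer size. Replacing the in-memory streams with the real source and sink is what makes the comparison meaningful.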

Upvotes: 1

Nullish Byte

Reputation: 396

Since Java 9, InputStream provides a transferTo(OutputStream) method; on Java 7 and later, the Files class (Files.copy) can also be used when one end is a file. Again, I make no claim about which is fastest, but you can benchmark these as well.
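A short sketch of the transferTo call mentioned above, assuming Java 9 or later. The class name TransferToExample is hypothetical; the second overload mirrors the question's case where the bytes are already in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical wrapper around InputStream.transferTo (Java 9+).
public class TransferToExample {

    // Streams all remaining bytes of `in` to `out`; neither stream is closed here.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        return in.transferTo(out);
    }

    // Variant for data that is already a byte array, such as a cache hit.
    public static long copy(byte[] cached, OutputStream out) throws IOException {
        try (InputStream in = new ByteArrayInputStream(cached)) {
            return in.transferTo(out);
        }
    }
}
```

transferTo returns the number of bytes transferred and leaves closing both streams to the caller.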

References:

  1. Official Documentation
  2. A similar Question

Upvotes: 2

Keynan

Reputation: 1346

I can't make any assertion that it's the fastest but I would suggest using apache-commons-io's IOUtils. Specifically

public static long copy(InputStream input, OutputStream output, int bufferSize)

and try to benchmark with different values of bufferSize.

https://commons.apache.org/proper/commons-io/javadocs/api-2.5/org/apache/commons/io/IOUtils.html#copy(java.io.InputStream,%20java.io.OutputStream,%20int)

The real problem here is the high level of abstraction you're working with. Provided you know exactly where the data is coming from (e.g. the file system), where it's going (e.g. a network socket), and which operating system you're running on, it is possible to leverage the kernel's stream support to make this much faster.

Googling for "zero copy kernel io" I found this article which is an okay overview: https://xunnanxu.github.io/2016/09/10/It-s-all-about-buffers-zero-copy-mmap-and-Java-NIO/
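In Java, one way to reach that kind of kernel support is FileChannel.transferTo. The sketch below assumes the source is a file and the destination is any WritableByteChannel (for example a SocketChannel). Whether the JVM actually uses a zero-copy path depends on the operating system and the channel type. The class name ZeroCopySketch is hypothetical.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper illustrating FileChannel.transferTo.
public class ZeroCopySketch {

    // Sends the whole file to `target`. transferTo may move fewer bytes than requested, hence the loop.
    // Assumes a blocking target; a non-blocking channel could return 0 repeatedly.
    public static long send(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = source.size();
            long position = 0;
            while (position < size) {
                position += source.transferTo(position, size - position, target);
            }
            return position;
        }
    }
}
```

This only applies when the data really lives in a file. For bytes that come out of an in-memory cache, as in the question's update, there is no file channel to transfer from.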

Upvotes: 2
