Pacerier

Reputation: 89643

Given a url String, how do I read all bytes into memory as fast as possible?

Given a URL string, I would like to read all the bytes (up to a specified number n) into memory as fast as possible.

What is the best solution to this problem?

I have come up with two solutions. However, because the internet connection is never constant, it is not possible to time the methods reliably to see which is more time-efficient. So I was wondering: which of these two functions should, in principle, be more time-efficient?

public static int GetBytes(String url, byte[] destination) throws Exception {
    //read all bytes (up to destination.length) into destination starting from offset 0
    java.io.InputStream input_stream = new java.net.URL(url).openStream();
    int total_bytes_read = 0;
    int ubound = destination.length - 1;
    while (true) {
        int data = input_stream.read();
        if (data == -1) {
            break;
        }
        destination[total_bytes_read] = (byte) data;
        if (total_bytes_read == ubound) {
            break;
        }
        ++total_bytes_read;
    }
    input_stream.close();
    return total_bytes_read;
}

public static int GetBytes2(String url, byte[] destination) throws Exception {
    //read all bytes (up to destination.length) into destination starting from offset 0
    java.io.InputStream input_stream = new java.net.URL(url).openStream();
    int total_bytes_read = 0;
    while (true) {
        int bytes_to_read = destination.length - total_bytes_read;
        if (bytes_to_read == 0) {
            break;
        }
        int bytes_read = input_stream.read(destination, total_bytes_read, bytes_to_read);
        if (bytes_read == -1) {
            break;
        }
        total_bytes_read += bytes_read;
    }
    input_stream.close();
    return total_bytes_read;
}

Test code:

public final class Test {

    public static void main(String args[]) throws Exception {
        String url = "http://en.wikipedia.org/wiki/August_2010_in_sports"; // a really huge page
        byte[] destination = new byte[3000000];
        long a = System.nanoTime();
        int bytes_read = GetBytes(url, destination);
        long b = System.nanoTime();
        System.out.println((b - a) / 1000000d);
    }
}

These are the results I got from my test code:

GetBytes:

12550.803514
12579.65927
12630.308032
12376.435205
12903.350407
12637.59136
12671.536975
12503.170865

GetBytes2:

12866.636589
12372.011314
12505.079466
12514.486199
12380.704728
19126.36572
12294.946634
12613.454368

Basically, I was wondering if anyone knows a better way to read all the bytes from a URL into memory in as little time as possible?

Upvotes: 0

Views: 1012

Answers (2)

vikiiii

Reputation: 9476

I suggest you use the jsoup Java HTML parser. I tried your URL with jsoup, and it took around a quarter of the time your code takes.

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

long a = System.nanoTime();
Document doc = Jsoup.connect("http://en.wikipedia.org/wiki/August_2010_in_sports").get();
String title = doc.title();
// System.out.println(doc.html()); // would print the whole HTML source
System.out.println(title);
long b = System.nanoTime();
System.out.println("Time Taken  " + (b - a) / 1000000d);

Output:

August 2010 in sports - Wikipedia, the free encyclopedia
Time Taken  3842.634244

Try this. You need to download the jsoup JAR file to use it.

Upvotes: 1

xikkub

Reputation: 1660

The more bytes you read at once, the faster they are read. Every read() call goes all the way down to the underlying stream, which creates massive overhead when you fetch one byte at a time, so GetBytes2() is faster than GetBytes(). Threading might also increase your read speeds, but the best solution is to optimize your algorithm.
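As a rough illustration of the same point, here is a minimal sketch that keeps your bulk-read loop but also wraps the stream in a java.io.BufferedInputStream; the 64 KiB buffer size and the method name are my own illustrative choices, not something I benchmarked against your URL:

import java.io.BufferedInputStream;
import java.io.InputStream;
import java.net.URL;

public final class BufferedFetch {

    // Same idea as GetBytes2: read in bulk into the destination array.
    // The BufferedInputStream means that even if some code path falls
    // back to small reads, most of them are served from an in-memory
    // buffer instead of hitting the socket each time.
    public static int getBytes(String url, byte[] destination) throws Exception {
        InputStream in = new BufferedInputStream(new URL(url).openStream(), 64 * 1024);
        try {
            int total = 0;
            while (total < destination.length) {
                int read = in.read(destination, total, destination.length - total);
                if (read == -1) {
                    break; // end of stream before the array filled up
                }
                total += read;
            }
            return total;
        } finally {
            in.close(); // close the stream even if read() throws
        }
    }
}

With the array-based read() overload the buffering layer matters less, since each call already transfers a large chunk; it mainly protects you against the per-call overhead your single-byte GetBytes() suffers from.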

Upvotes: 1
