Reputation: 1241
I am writing a simple web crawler in Java. I want it to be able to download as many pages per second as possible. Is there a package out there that makes doing asynchronous HTTP requests easy in Java? I have used HttpURLConnection, but that is blocking. I also know there is something in Apache's HttpCore NIO, but I am looking for something more lightweight. I tried that package, and I was getting better throughput using HttpURLConnection on multiple threads.
Upvotes: 3
Views: 6685
Reputation: 27538
Generally, data-intensive protocols tend to perform better in terms of raw throughput with classic blocking I/O than with NIO, as long as the number of threads stays below 1000. At least that is certainly the case with client-side HTTP, based on the (likely imperfect and possibly biased) HTTP benchmark used by Apache HttpClient [1].
One may be much better off using a blocking HTTP client with threads, as long as the number of threads is moderate (<250).
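To illustrate the blocking-client-with-threads approach, here is a minimal sketch using a fixed thread pool and plain HttpURLConnection (assuming Java 9+ for List.of and readAllBytes; the pool size and URLs are placeholders):

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class BlockingCrawler {

        public static void main(String[] args) throws Exception {
            List<String> urls = List.of(
                    "http://example.com/",
                    "http://example.org/");

            // A moderate, fixed-size pool; each worker issues plain blocking requests.
            ExecutorService pool = Executors.newFixedThreadPool(100);

            for (String url : urls) {
                pool.submit(() -> fetch(url));
            }
            pool.shutdown();
        }

        private static void fetch(String url) {
            try {
                HttpURLConnection conn =
                        (HttpURLConnection) new URL(url).openConnection();
                conn.setConnectTimeout(5_000);
                conn.setReadTimeout(5_000);
                try (InputStream in = conn.getInputStream()) {
                    byte[] body = in.readAllBytes(); // blocks this worker thread only
                    System.out.println(url + " -> " + body.length + " bytes");
                }
            } catch (Exception e) {
                System.err.println(url + " failed: " + e);
            }
        }
    }

Each blocked read ties up only its own pool thread, which is cheap at these thread counts and keeps the code far simpler than an NIO event loop.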
If you are absolutely sure you want an NIO-based HTTP client, I can recommend the Jetty HTTP client, which I personally consider the best asynchronous HTTP client at the moment.
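For reference, here is a minimal sketch of non-blocking requests against the Jetty 9.x client API (org.eclipse.jetty.client); package names and listener types differ across Jetty versions, and the URLs are placeholders:

    import org.eclipse.jetty.client.HttpClient;
    import org.eclipse.jetty.client.api.Result;
    import org.eclipse.jetty.client.util.BufferingResponseListener;

    public class JettyCrawler {

        public static void main(String[] args) throws Exception {
            HttpClient client = new HttpClient();
            client.start();

            String[] urls = { "http://example.com/", "http://example.org/" };

            for (String url : urls) {
                // send() with a listener returns immediately; the response
                // is delivered later on one of Jetty's internal threads.
                client.newRequest(url).send(new BufferingResponseListener() {
                    @Override
                    public void onComplete(Result result) {
                        if (result.isSucceeded()) {
                            System.out.println(url + " -> "
                                    + getContent().length + " bytes");
                        } else {
                            System.err.println(url + " failed: "
                                    + result.getFailure());
                        }
                    }
                });
            }
            // A real crawler would wait for outstanding exchanges before
            // stopping the client.
        }
    }

All requests are issued up front and completions arrive asynchronously, so a handful of I/O threads can keep many connections in flight at once.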
[1] http://wiki.apache.org/HttpComponents/HttpClient3vsHttpClient4vsHttpCore
Upvotes: 6
Reputation: 7387
While that question isn't quite the same as yours, you may find its answers useful: Asynchronous HTTP Client for Java
As a side-note, if you're going to download "as many pages per second as possible", you should bear in mind that crawlers can inadvertently grind a weak server to a halt. You should probably read up on "robots.txt" and the appropriate way of interpreting this file before you unleash your creation on anything outside of your own personal test setup.
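For illustration, here is a deliberately naive robots.txt check (a hand-rolled sketch, not a spec-compliant parser; a real crawler should also honor per-agent groups, wildcards, and Crawl-delay, and the host name below is a placeholder):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.List;

    public class RobotsCheck {

        /** Fetches /robots.txt and collects Disallow rules for User-agent: *. */
        static List<String> disallowedPrefixes(String host) throws Exception {
            List<String> prefixes = new ArrayList<>();
            URL robots = new URL("http://" + host + "/robots.txt");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(robots.openStream()))) {
                boolean inStarGroup = false;
                String line;
                while ((line = in.readLine()) != null) {
                    line = line.trim();
                    if (line.toLowerCase().startsWith("user-agent:")) {
                        inStarGroup = line.substring(11).trim().equals("*");
                    } else if (inStarGroup
                            && line.toLowerCase().startsWith("disallow:")) {
                        String path = line.substring(9).trim();
                        if (!path.isEmpty()) {
                            prefixes.add(path);
                        }
                    }
                }
            }
            return prefixes;
        }

        /** A URL path is allowed if no Disallow rule is a prefix of it. */
        static boolean allowed(String path, List<String> disallowed) {
            for (String prefix : disallowed) {
                if (path.startsWith(prefix)) {
                    return false;
                }
            }
            return true;
        }

        public static void main(String[] args) throws Exception {
            List<String> rules = disallowedPrefixes("example.com");
            System.out.println("/private allowed? " + allowed("/private", rules));
        }
    }

Checking this once per host (and caching the result) costs you almost nothing in throughput while keeping your crawler off paths site owners have asked you to avoid.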
Upvotes: 3