Reputation: 191
I have a program that needs to download small text files from a webpage, and I wrote the following code to do so:
// Needs: java.io.BufferedReader, java.io.InputStreamReader, java.net.URL,
// java.net.URLConnection, java.nio.charset.StandardCharsets, java.util.stream.Collectors
// Open the connection to the URL web page
URL url = new URL(urlLink);
URLConnection connection = url.openConnection();
// Wrap the response stream in a buffered reader. try-with-resources closes it
// even if reading fails. UTF-8 is an assumption; without it the platform default is used.
try (BufferedReader bR = new BufferedReader(
        new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
    // Fetch all of the lines from the buffered reader and join them all
    // together into a single string.
    return bR.lines().collect(Collectors.joining("\n"));
}
Unfortunately, the server I download the data from has a very long time to first byte (TTFB). According to the browser developer tools (F12), TTFB accounts for around 90% of the total download time. This makes my Java program extremely slow when there are many files to download: for each file we open a connection, wait 250 ms, download, then open another connection, wait another 250 ms, download, and so on.

I was able to mitigate the problem with threads: around 10 threads each download a portion of the files I need. This speeds up my program, but it does not solve the fundamental issue. Each thread still has to open a connection, wait 250 ms, download, and repeat.

My ideal solution would be to send all of the requests at the same time, wait out the 250 ms TTFB once, and then download the data for each file. The only way I can think of doing this is to create 1,000 or so threads and open a URL connection on each one, but this seems like a really poor approach. Is there any other way to open multiple URL connections and let the TTFB periods happen concurrently?
Upvotes: 1
Views: 1047
Reputation: 562
I think you are on the right track. Starting several threads to reduce the total TTFB wait sounds like a good idea. To avoid starting an extreme number of threads, you could use a design pattern such as the Object Pool pattern (for threads, a thread pool) to cap the number of threads active at once.
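As a rough illustration of capping the thread count, here is a minimal sketch using a fixed-size pool from `java.util.concurrent`. The class name `PooledDownloader`, the method `downloadAll`, and the `fetcher` parameter are hypothetical; `fetcher` stands in for whatever method wraps the download code from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class PooledDownloader {
    /**
     * Applies fetcher to every link using at most poolSize threads and
     * returns the results in the same order as links.
     * fetcher is a hypothetical stand-in for the question's download method.
     */
    public static List<String> downloadAll(List<String> links, int poolSize,
                                           Function<String, String> fetcher) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String link : links) {
                tasks.add(() -> fetcher.apply(link));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has finished and keeps task order.
            for (Future<String> future : pool.invokeAll(tasks)) {
                // get() rethrows a task failure wrapped in ExecutionException.
                results.add(future.get());
            }
            return results;
        } finally {
            // Release the worker threads once all downloads are done.
            pool.shutdown();
        }
    }
}
```

With a pool like this, at most `poolSize` requests wait on TTFB at any moment, so the pool size trades memory against how many waits overlap.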
Upvotes: 1
Reputation: 13525
1000 threads would consume roughly 500-1000 megabytes of memory for their stacks. This is the only drawback of this approach. If your computer has enough memory, this is the easiest and most reliable solution. If you do not want to spend that much memory (especially if you want more simultaneous connections), you can use the Java NIO API. It comes in two flavours, NIO1 and NIO2. NIO1 is too complex to use directly but has numerous wrapper libraries, e.g. Netty. NIO2 can be used directly. In either case, you can dramatically reduce memory consumption and keep a reasonable number of worker threads (e.g. equal to the number of processor cores).
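Writing an HTTP client directly on NIO channels is fairly involved, so the sketch below swaps in a different technique: the `java.net.http.HttpClient` added in Java 11, whose `sendAsync` method returns immediately instead of blocking a thread per request. It is not the NIO code this answer describes, only one way to get the same overlap of TTFB waits with few threads. The class name `AsyncDownloader` and the method `downloadAll` are hypothetical, and the sketch assumes Java 11 or later and plain-text responses.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class AsyncDownloader {
    /**
     * Sends every request without waiting for earlier responses,
     * then gathers the response bodies in the same order as links.
     */
    public static List<String> downloadAll(List<String> links) {
        HttpClient client = HttpClient.newHttpClient();
        // Start all requests up front; sendAsync does not block the caller.
        List<CompletableFuture<String>> pending = links.stream()
                .map(link -> HttpRequest.newBuilder(URI.create(link)).GET().build())
                .map(request -> client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                        .thenApply(HttpResponse::body))
                .collect(Collectors.toList());
        // join() waits for each response; a failed request surfaces as CompletionException.
        return pending.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }
}
```

Because every request is in flight before the first `join()`, the TTFB waits overlap rather than add up. Whether the server accepts that many simultaneous requests is a separate question that this sketch does not address.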
Upvotes: 0