Reputation: 11596
There is an online file (such as http://www.example.com/information.asp
) I need to grab and save to a directory. I know there are several methods for grabbing and reading online files (URLs) line-by-line, but is there a way to just download and save the file using Java?
Upvotes: 479
Views: 768818
Reputation: 633
One liner with the built-in Java HTTP Client added in Java 11:
URI url = ...;
Path path = ...; // output file path
HttpClient.newHttpClient().send(HttpRequest.newBuilder(url).build(), HttpResponse.BodyHandlers.ofFile(path));
If you're making multiple requests, you can reuse the HttpClient and the body handler.
You can also customize the request by adding parameters to the HttpRequest
and customize writing/creating the file by adding parameters to the HttpResponse.BodyHandlers.ofFile
method.
No need to worry about closing resources.
Upvotes: 4
Reputation: 2097
There is a method, U.fetch(url)
, in the underscore-java library.
File pom.xml:
<dependency>
<groupId>com.github.javadev</groupId>
<artifactId>underscore</artifactId>
<version>1.84</version>
</dependency>
Code example:
import com.github.underscore.U;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
public class Download {
public static void main(String[] args) throws IOException {
Files.write(Paths.get("data.bin"),
U.fetch("https://stackoverflow.com/questions"
+ "/921262/how-to-download-and-save-a-file-from-internet-using-java").blob());
}
}
Upvotes: 1
Reputation: 1141
If you are behind a proxy, you can set the proxies in the Java program as below:
Properties systemSettings = System.getProperties();
systemSettings.put("proxySet", "true");
systemSettings.put("https.proxyHost", "HTTPS proxy of your org");
systemSettings.put("https.proxyPort", "8080");
If you are not behind a proxy, don't include the lines above in your code. Full working code to download a file when you are behind a proxy.
public static void main(String[] args) throws IOException {
String url = "https://raw.githubusercontent.com/bpjoshi/fxservice/master/src/test/java/com/bpjoshi/fxservice/api/TradeControllerTest.java";
OutputStream outStream = null;
URLConnection connection = null;
InputStream is = null;
File targetFile = null;
URL server = null;
// Setting up proxies
Properties systemSettings = System.getProperties();
systemSettings.put("proxySet", "true");
systemSettings.put("https.proxyHost", "HTTPS proxy of my organisation");
systemSettings.put("https.proxyPort", "8080");
// The same way we could also set proxy for HTTP
System.setProperty("java.net.useSystemProxies", "true");
// Code to fetch file
try {
server = new URL(url);
connection = server.openConnection();
is = connection.getInputStream();
byte[] buffer = new byte[is.available()];
is.read(buffer);
targetFile = new File("src/main/resources/targetFile.java");
outStream = new FileOutputStream(targetFile);
outStream.write(buffer);
} catch (MalformedURLException e) {
System.out.println("THE URL IS NOT CORRECT ");
e.printStackTrace();
} catch (IOException e) {
System.out.println("I/O exception");
e.printStackTrace();
}
finally{
if(outStream != null)
outStream.close();
}
}
Upvotes: -1
Reputation: 75886
To summarize (and somehow polish and update) previous answers. The three following methods are practically equivalent. (I added explicit timeouts, because I think they are a must. Nobody wants a download to freeze forever when the connection is lost.)
public static void saveUrl1(final Path file, final URL url,
int secsConnectTimeout, int secsReadTimeout))
throws MalformedURLException, IOException {
// Files.createDirectories(file.getParent()); // Optional, make sure parent directory exists
try (BufferedInputStream in = new BufferedInputStream(
streamFromUrl(url, secsConnectTimeout,secsReadTimeout));
OutputStream fout = Files.newOutputStream(file)) {
final byte data[] = new byte[8192];
int count;
while((count = in.read(data)) > 0)
fout.write(data, 0, count);
}
}
public static void saveUrl2(final Path file, final URL url,
int secsConnectTimeout, int secsReadTimeout))
throws MalformedURLException, IOException {
// Files.createDirectories(file.getParent()); // Optional, make sure parent directory exists
try (ReadableByteChannel rbc = Channels.newChannel(
streamFromUrl(url, secsConnectTimeout, secsReadTimeout)
);
FileChannel channel = FileChannel.open(file,
StandardOpenOption.CREATE,
StandardOpenOption.TRUNCATE_EXISTING,
StandardOpenOption.WRITE)
) {
channel.transferFrom(rbc, 0, Long.MAX_VALUE);
}
}
public static void saveUrl3(final Path file, final URL url,
int secsConnectTimeout, int secsReadTimeout))
throws MalformedURLException, IOException {
// Files.createDirectories(file.getParent()); // Optional, make sure parent directory exists
try (InputStream in = streamFromUrl(url, secsConnectTimeout,secsReadTimeout) ) {
Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
}
}
public static InputStream streamFromUrl(URL url,int secsConnectTimeout,int secsReadTimeout) throws IOException {
URLConnection conn = url.openConnection();
if(secsConnectTimeout>0)
conn.setConnectTimeout(secsConnectTimeout*1000);
if(secsReadTimeout>0)
conn.setReadTimeout(secsReadTimeout*1000);
return conn.getInputStream();
}
I don't find significant differences, and all seem right to me. They are safe and efficient. (Differences in speed seem hardly relevant - I write 180 MB from the local server to a SSD disk in times that fluctuate around 1.2 to 1.5 secs). They don't require external libraries. All work with arbitrary sizes and (to my experience) HTTP redirections.
Additionally, all throw FileNotFoundException
if the resource is not found (error 404, typically), and java.net.UnknownHostException
if the DNS resolution failed; other IOException correspond to errors during transmission.
Upvotes: 1
Reputation: 19175
When using Java 7+, use the following method to download a file from the Internet and save it to some directory:
private static Path download(String sourceURL, String targetDirectory) throws IOException
{
URL url = new URL(sourceURL);
String fileName = sourceURL.substring(sourceURL.lastIndexOf('/') + 1, sourceURL.length());
Path targetPath = new File(targetDirectory + File.separator + fileName).toPath();
Files.copy(url.openStream(), targetPath, StandardCopyOption.REPLACE_EXISTING);
return targetPath;
}
Documentation is here.
Upvotes: 19
Reputation: 1449
This answer is almost exactly like the selected answer, but with two enhancements: it's a method and it closes out the FileOutputStream object:
public static void downloadFileFromURL(String urlString, File destination) {
try {
URL website = new URL(urlString);
ReadableByteChannel rbc;
rbc = Channels.newChannel(website.openStream());
FileOutputStream fos = new FileOutputStream(destination);
fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
fos.close();
rbc.close();
} catch (IOException e) {
e.printStackTrace();
}
}
Upvotes: 17
Reputation: 1049
There is an issue with simple usage of:
org.apache.commons.io.FileUtils.copyURLToFile(URL, File)
if you need to download and save very large files, or in general if you need automatic retries in case connection is dropped.
I suggest Apache HttpClient in such cases, along with org.apache.commons.io.FileUtils. For example:
GetMethod method = new GetMethod(resource_url);
try {
int statusCode = client.executeMethod(method);
if (statusCode != HttpStatus.SC_OK) {
logger.error("Get method failed: " + method.getStatusLine());
}
org.apache.commons.io.FileUtils.copyInputStreamToFile(
method.getResponseBodyAsStream(), new File(resource_file));
} catch (HttpException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
method.releaseConnection();
}
Upvotes: 0
Reputation: 32004
Use Apache Commons IO. It is just one line of code:
FileUtils.copyURLToFile(URL, File)
Upvotes: 545
Reputation: 7168
Downloading a file requires you to read it. Either way, you will have to go through the file in some way. Instead of line by line, you can just read it by bytes from the stream:
BufferedInputStream in = new BufferedInputStream(new URL("http://www.website.com/information.asp").openStream())
byte data[] = new byte[1024];
int count;
while((count = in.read(data, 0, 1024)) != -1)
{
out.write(data, 0, count);
}
Upvotes: 23
Reputation: 11799
Here is a concise, readable, JDK-only solution with properly closed resources:
static long download(String url, String fileName) throws IOException {
try (InputStream in = URI.create(url).toURL().openStream()) {
return Files.copy(in, Paths.get(fileName));
}
}
Two lines of code and no dependencies.
Here's a complete file downloader example program with output, error checking, and command line argument checks:
package so.downloader;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;
public class Application {
public static void main(String[] args) throws IOException {
if (2 != args.length) {
System.out.println("USAGE: java -jar so-downloader.jar <source-URL> <target-filename>");
System.exit(1);
}
String sourceUrl = args[0];
String targetFilename = args[1];
long bytesDownloaded = download(sourceUrl, targetFilename);
System.out.println(String.format("Downloaded %d bytes from %s to %s.", bytesDownloaded, sourceUrl, targetFilename));
}
static long download(String url, String fileName) throws IOException {
try (InputStream in = URI.create(url).toURL().openStream()) {
return Files.copy(in, Paths.get(fileName));
}
}
}
As noted in the so-downloader repository README:
To run file download program:
java -jar so-downloader.jar <source-URL> <target-filename>
For example:
java -jar so-downloader.jar https://github.com/JanStureNielsen/so-downloader/archive/main.zip so-downloader-source.zip
Upvotes: 34
Reputation: 29
Solution on java.net.http.HttpClient using Authorization:
HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder()
.GET()
.header("Accept", "application/json")
// .header("Authorization", "Basic ci5raG9kemhhZXY6NDdiYdfjlmNUM=") if you need
.uri(URI.create("https://jira.google.ru/secure/attachment/234096/screenshot-1.png"))
.build();
HttpResponse<InputStream> response = client.send(request, HttpResponse.BodyHandlers.ofInputStream());
try (InputStream in = response.body()) {
Files.copy(in, Paths.get(target + "screenshot-1.png"), StandardCopyOption.REPLACE_EXISTING);
}
Upvotes: 3
Reputation: 130
This can read a file on the Internet and write it into a file.
import java.net.URL;
import java.io.FileOutputStream;
import java.io.File;
public class Download {
public static void main(String[] args) throws Exception {
URL url = new URL("https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png"); // Input URL
FileOutputStream out = new FileOutputStream(new File("out.png")); // Output file
out.write(url.openStream().readAllBytes());
out.close();
}
}
Upvotes: 0
Reputation: 191
First method using the new channel
ReadableByteChannel aq = Channels.newChannel(new url("https//asd/abc.txt").openStream());
FileOutputStream fileOS = new FileOutputStream("C:Users/local/abc.txt")
FileChannel writech = fileOS.getChannel();
Second method using FileUtils
FileUtils.copyURLToFile(new url("https//asd/abc.txt", new local file on system("C":/Users/system/abc.txt"));
Third method using
InputStream xy = new ("https//asd/abc.txt").openStream();
This is how we can download file by using basic Java code and other third-party libraries. These are just for quick reference. Please google with the above keywords to get detailed information and other options.
Upvotes: -1
Reputation: 176
Below is the sample code to download a movie from the Internet with Java code:
URL url = new
URL("http://103.66.178.220/ftp/HDD2/Hindi%20Movies/2018/Hichki%202018.mkv");
BufferedInputStream bufferedInputStream = new BufferedInputStream(url.openStream());
FileOutputStream stream = new FileOutputStream("/home/sachin/Desktop/test.mkv");
int count = 0;
byte[] b1 = new byte[100];
while((count = bufferedInputStream.read(b1)) != -1) {
System.out.println("b1:" + b1 + ">>" + count + ">> KB downloaded:" + new File("/home/sachin/Desktop/test.mkv").length()/1024);
stream.write(b1, 0, count);
}
Upvotes: 1
Reputation: 448
You can do this in one line using netloader for Java:
new NetFile(new File("my/zips/1.zip"), "https://example.com/example.zip", -1).load(); // Returns true if succeed, otherwise false.
Upvotes: 0
Reputation: 14584
It's possible to download the file with with Apache's HttpComponents
instead of Commons IO. This code allows you to download a file in Java according to its URL and save it at the specific destination.
public static boolean saveFile(URL fileURL, String fileSavePath) {
boolean isSucceed = true;
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpGet httpGet = new HttpGet(fileURL.toString());
httpGet.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0");
httpGet.addHeader("Referer", "https://www.google.com");
try {
CloseableHttpResponse httpResponse = httpClient.execute(httpGet);
HttpEntity fileEntity = httpResponse.getEntity();
if (fileEntity != null) {
FileUtils.copyInputStreamToFile(fileEntity.getContent(), new File(fileSavePath));
}
} catch (IOException e) {
isSucceed = false;
}
httpGet.releaseConnection();
return isSucceed;
}
In contrast to the single line of code:
FileUtils.copyURLToFile(fileURL, new File(fileSavePath),
URLS_FETCH_TIMEOUT, URLS_FETCH_TIMEOUT);
This code will give you more control over a process and let you specify not only time-outs, but User-Agent
and Referer
values, which are critical for many websites.
Upvotes: 2
Reputation: 10362
This is another Java 7 variant based on Brian Risk's answer with usage of a try-with statement:
public static void downloadFileFromURL(String urlString, File destination) throws Throwable {
URL website = new URL(urlString);
try(
ReadableByteChannel rbc = Channels.newChannel(website.openStream());
FileOutputStream fos = new FileOutputStream(destination);
) {
fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
}
}
Upvotes: 6
Reputation: 3623
Simpler non-blocking I/O usage:
URL website = new URL("http://www.website.com/information.asp");
try (InputStream in = website.openStream()) {
Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
}
Upvotes: 132
Reputation: 2818
import java.io.*;
import java.net.*;
public class filedown {
public static void download(String address, String localFileName) {
OutputStream out = null;
URLConnection conn = null;
InputStream in = null;
try {
URL url = new URL(address);
out = new BufferedOutputStream(new FileOutputStream(localFileName));
conn = url.openConnection();
in = conn.getInputStream();
byte[] buffer = new byte[1024];
int numRead;
long numWritten = 0;
while ((numRead = in.read(buffer)) != -1) {
out.write(buffer, 0, numRead);
numWritten += numRead;
}
System.out.println(localFileName + "\t" + numWritten);
}
catch (Exception exception) {
exception.printStackTrace();
}
finally {
try {
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
catch (IOException ioe) {
}
}
}
public static void download(String address) {
int lastSlashIndex = address.lastIndexOf('/');
if (lastSlashIndex >= 0 &&
lastSlashIndex < address.length() - 1) {
download(address, (new URL(address)).getFile());
}
else {
System.err.println("Could not figure out local file name for "+address);
}
}
public static void main(String[] args) {
for (int i = 0; i < args.length; i++) {
download(args[i]);
}
}
}
Upvotes: 10
Reputation: 116304
Give Java NIO a try:
URL website = new URL("http://www.website.com/information.asp");
ReadableByteChannel rbc = Channels.newChannel(website.openStream());
FileOutputStream fos = new FileOutputStream("information.html");
fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
Using transferFrom()
is potentially much more efficient than a simple loop that reads from the source channel and writes to this channel. Many operating systems can transfer bytes directly from the source channel into the filesystem cache without actually copying them.
Check more about it here.
Note: The third parameter in transferFrom is the maximum number of bytes to transfer. Integer.MAX_VALUE
will transfer at most 2^31 bytes, Long.MAX_VALUE
will allow at most 2^63 bytes (larger than any file in existence).
Upvotes: 589
Reputation: 1
public class DownloadManager {
static String urls = "[WEBSITE NAME]";
public static void main(String[] args) throws IOException{
URL url = verify(urls);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
InputStream in = null;
String filename = url.getFile();
filename = filename.substring(filename.lastIndexOf('/') + 1);
FileOutputStream out = new FileOutputStream("C:\\Java2_programiranje/Network/DownloadTest1/Project/Output" + File.separator + filename);
in = connection.getInputStream();
int read = -1;
byte[] buffer = new byte[4096];
while((read = in.read(buffer)) != -1){
out.write(buffer, 0, read);
System.out.println("[SYSTEM/INFO]: Downloading file...");
}
in.close();
out.close();
System.out.println("[SYSTEM/INFO]: File Downloaded!");
}
private static URL verify(String url){
if(!url.toLowerCase().startsWith("http://")) {
return null;
}
URL verifyUrl = null;
try{
verifyUrl = new URL(url);
}catch(Exception e){
e.printStackTrace();
}
return verifyUrl;
}
}
Upvotes: -2
Reputation: 75886
There are many elegant and efficient answers here. But the conciseness can make us lose some useful information. In particular, one often does not want to consider a connection error an Exception, and one might want to treat differently some kind of network-related errors - for example, to decide if we should retry the download.
Here's a method that does not throw Exceptions for network errors (only for truly exceptional problems, as malformed url or problems writing to the file)
/**
* Downloads from a (http/https) URL and saves to a file.
* Does not consider a connection error an Exception. Instead it returns:
*
* 0=ok
* 1=connection interrupted, timeout (but something was read)
* 2=not found (FileNotFoundException) (404)
* 3=server error (500...)
* 4=could not connect: connection timeout (no internet?) java.net.SocketTimeoutException
* 5=could not connect: (server down?) java.net.ConnectException
* 6=could not resolve host (bad host, or no internet - no dns)
*
* @param file File to write. Parent directory will be created if necessary
* @param url http/https url to connect
* @param secsConnectTimeout Seconds to wait for connection establishment
* @param secsReadTimeout Read timeout in seconds - trasmission will abort if it freezes more than this
* @return See above
* @throws IOException Only if URL is malformed or if could not create the file
*/
public static int saveUrl(final Path file, final URL url,
int secsConnectTimeout, int secsReadTimeout) throws IOException {
Files.createDirectories(file.getParent()); // make sure parent dir exists , this can throw exception
URLConnection conn = url.openConnection(); // can throw exception if bad url
if( secsConnectTimeout > 0 ) conn.setConnectTimeout(secsConnectTimeout * 1000);
if( secsReadTimeout > 0 ) conn.setReadTimeout(secsReadTimeout * 1000);
int ret = 0;
boolean somethingRead = false;
try (InputStream is = conn.getInputStream()) {
try (BufferedInputStream in = new BufferedInputStream(is); OutputStream fout = Files
.newOutputStream(file)) {
final byte data[] = new byte[8192];
int count;
while((count = in.read(data)) > 0) {
somethingRead = true;
fout.write(data, 0, count);
}
}
} catch(java.io.IOException e) {
int httpcode = 999;
try {
httpcode = ((HttpURLConnection) conn).getResponseCode();
} catch(Exception ee) {}
if( somethingRead && e instanceof java.net.SocketTimeoutException ) ret = 1;
else if( e instanceof FileNotFoundException && httpcode >= 400 && httpcode < 500 ) ret = 2;
else if( httpcode >= 400 && httpcode < 600 ) ret = 3;
else if( e instanceof java.net.SocketTimeoutException ) ret = 4;
else if( e instanceof java.net.ConnectException ) ret = 5;
else if( e instanceof java.net.UnknownHostException ) ret = 6;
else throw e;
}
return ret;
}
Upvotes: 3
Reputation: 34928
public void saveUrl(final String filename, final String urlString)
throws MalformedURLException, IOException {
BufferedInputStream in = null;
FileOutputStream fout = null;
try {
in = new BufferedInputStream(new URL(urlString).openStream());
fout = new FileOutputStream(filename);
final byte data[] = new byte[1024];
int count;
while ((count = in.read(data, 0, 1024)) != -1) {
fout.write(data, 0, count);
}
} finally {
if (in != null) {
in.close();
}
if (fout != null) {
fout.close();
}
}
}
You'll need to handle exceptions, probably external to this method.
Upvotes: 90
Reputation: 1349
Personally, I've found Apache's HttpClient to be more than capable of everything I've needed to do with regards to this. Here is a great tutorial on using HttpClient
Upvotes: 9