How to save a file sent in an HTTP response without including the header

I am trying to write a program in Java that downloads a file from a URL. I want to do this without using a URLConnection; instead I am just using TCP sockets. I have succeeded in sending the GET request and picking up the server's response, but I can't seem to get my head around saving the file from the response without the HTTP header (just the file).

import java.net.*;
import java.io.*;

public class DownloadClient {
    public static void main(String[] args) {
        try {
            if (args.length != 3) {
                System.out.println(
                    "Use: java DownloadClient <host> <port> <filename/path>"
                );
            } else {
                // Sorting out arguments from the args array
                String host;
                int port; 
                String filename;
                if (args[0].charAt(args[0].length()-1) == '/') {
                    host = args[0].substring(0,args[0].length()-1);
                } else {
                    host = args[0];
                }
                port = Integer.parseInt(args[1]);
                if (args[2].charAt(0) == '/') {
                    filename = args[2];
                } else {
                    filename = "/"+args[2];
                }

                Socket con = new Socket(host, port);

                // GET request
                BufferedWriter out = new BufferedWriter(
                    new OutputStreamWriter(con.getOutputStream(), "UTF8")
                );
                out.write("GET "+filename+" HTTP/1.1\r\n");
                out.write("Host: "+host+"\r\n");
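                out.write("User-Agent: Java DownloadClient\r\n");
                // Ask the server to close the connection after the response,
                // so the read loop further down sees end-of-stream.
                out.write("Connection: close\r\n\r\n");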
                out.flush();

                InputStream in = con.getInputStream();
                OutputStream outputFile = new FileOutputStream(
                    filename.substring(filename.lastIndexOf('/')+1)
                );
                byte[] buffer = new byte[1024];
                int bytesRead = 0;

                // read() returns -1 at end-of-stream; the buffer can be reused
                while ((bytesRead = in.read(buffer)) != -1) {
                    outputFile.write(buffer, 0, bytesRead);
                }

                outputFile.close();
                in.close();
                con.close();
            }
        } catch (IOException e) {
            System.err.println(e); 
        }
    }
}

I guess that I should somehow look for \r\n\r\n, as it marks the empty line just before the content begins. So far, this program creates a file that contains the entire HTTP response, header included.

Upvotes: 0

Views: 1766

Answers (1)

Stephen C

Reputation: 718926

The recommended way to do this is NOT to talk to a web server over a plain Socket. Use one of the existing client-side HTTP stacks, e.g. the standard HttpURLConnection stack or the Apache HttpClient stack.
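As a rough illustration of that recommendation, here is a minimal sketch using HttpURLConnection. The class name UrlDownload and the method download are made up for this example, and it assumes a plain 200 response is the only outcome worth saving. The stream returned by getInputStream() contains only the body, so no header handling is needed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not part of the original question.
class UrlDownload {

    // Saves the body of the response for `address` into `target`
    // and returns the number of bytes written.
    static long download(String address, Path target) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(address).toURL().openConnection();
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status: " + status);
            }
            // The header has already been parsed; this stream is the body only.
            try (InputStream body = conn.getInputStream()) {
                return Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("Use: java UrlDownload <url> <target-file>");
            return;
        }
        long written = download(args[0], Paths.get(args[1]));
        System.out.println(written + " bytes written");
    }
}
```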

If you insist on talking using a plain socket, then it is up to you to process / deal with the "Header" lines in any response ... and everything else ... in accordance with the HTTP specification.

I guess that I should somehow look for \r\n\r\n as it indicates the empty line just before the content begins.

Yup ...
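If you do stay with a plain socket, one way to look for that \r\n\r\n is sketched below. The names HeaderSkipper, readHeader and copyBody are invented for this example. It assumes the server closes the connection when the body is complete (for instance because the request carried "Connection: close") and that the body is neither chunked nor compressed. The search is done on bytes, not characters, so a binary file is not damaged by charset decoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original question.
class HeaderSkipper {

    // Consumes bytes up to and including the first CR LF CR LF and returns
    // the header block as text, without that terminator.
    static String readHeader(InputStream in) throws IOException {
        ByteArrayOutputStream header = new ByteArrayOutputStream();
        int matched = 0; // number of bytes of "\r\n\r\n" matched so far
        int b;
        while (matched < 4 && (b = in.read()) != -1) {
            header.write(b);
            int expected = (matched % 2 == 0) ? '\r' : '\n';
            if (b == expected) {
                matched++;
            } else {
                // A stray CR may be the start of a new terminator.
                matched = (b == '\r') ? 1 : 0;
            }
        }
        if (matched < 4) {
            throw new IOException("Stream ended before the end of the header");
        }
        String text = new String(header.toByteArray(), StandardCharsets.ISO_8859_1);
        return text.substring(0, text.length() - 4);
    }

    // Copies everything that is left in the stream (the body) to `out`
    // and returns the number of bytes copied.
    static long copyBody(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```

In the program from the question, this would mean calling readHeader on the socket's input stream first and then copyBody with the same stream and the FileOutputStream. Because readHeader reads one byte at a time, wrapping the socket stream in a single BufferedInputStream and passing that same object to both methods is probably worthwhile.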

And you also potentially need to deal with the server sending a compressed response, a response using an unexpected character set, a 3xx redirect, and so on.
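To give an idea of what detecting those cases involves, here is a small sketch that parses the header block (the text before the empty line) and checks for a few of them. The class ResponseHead and its methods are hypothetical names. It only detects the situations; it does not follow the redirect or decode the body. The chunked check is an addition of this example: chunked transfer encoding is not named above, but HTTP/1.1 servers commonly use it, and a chunked body cannot simply be copied to a file as-is.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of the original question.
class ResponseHead {
    final int status;
    // Header names are case-insensitive in HTTP, so the map is too.
    final Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    // `headerBlock` is the status line plus header lines, separated by CR LF.
    ResponseHead(String headerBlock) {
        String[] lines = headerBlock.split("\r\n");
        // The status line looks like: HTTP/1.1 200 OK
        String[] parts = lines[0].split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed status line: " + lines[0]);
        }
        status = Integer.parseInt(parts[1]);
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    // A 3xx response with a Location header: the file lives somewhere else.
    boolean isRedirect() {
        return status >= 300 && status < 400 && headers.containsKey("Location");
    }

    // The body is sent in length-prefixed chunks and must be reassembled.
    boolean isChunked() {
        return headers.getOrDefault("Transfer-Encoding", "")
                      .toLowerCase(Locale.ROOT).contains("chunked");
    }

    // The body is encoded (e.g. gzip) and must be decoded before saving.
    boolean isCompressed() {
        String encoding = headers.getOrDefault("Content-Encoding", "identity");
        return !encoding.toLowerCase(Locale.ROOT).equals("identity");
    }
}
```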

Upvotes: 3
