Ada

Reputation: 624

Why are two files that should be equal not equal? I download a file with a Java Socket and compare it with the same file downloaded with Mozilla

I am downloading a file using a Java Socket. This code only tests whether the two files are equal; it is part of a bigger project.

import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintWriter;
import java.net.Socket;
import java.util.Arrays;


public class Compare {

/**
 * @param args
 * @throws IOException 
 */
public static void main(String[] args) throws IOException {
    Socket sk = null;
    try
    {
          sk = new Socket("wlab.cs.bilkent.edu.tr", 80);
          if (sk.isConnected())
          {

              PrintWriter out = new PrintWriter(sk.getOutputStream(),true);
              out.println("GET /" + "/PA2/test5MB.bin" + " HTTP/1.1");
              out.println("Host: " + 80);
             // out.println("Range: " + 0 + '-' + 5242879);
              out.println("Connection: close\r\n");
              out.println("");
              out.flush();

          }
          }catch(Exception e){

          }


    // receive file
    byte [] mybytearray  = new byte [5242880]; // when i use HEAD, this size is returned.

   // FileInputStream is = new FileInputStream("C:\\Users\\Eda\\workspace\\network1\\t1est5MB.bin");
    InputStream is = sk.getInputStream(); //when I used this one, the files don't match, but the above declaration works.

    FileOutputStream fos = new FileOutputStream( new File ("C:\\Users\\Eda\\workspace\\network1\\tesst5MB.bin"));
    BufferedOutputStream bos = new BufferedOutputStream(fos);


    int offset3 = 0;
    int numRead3 = 0;
    System.out.println("1 in length: " +mybytearray.length);

  while(offset3 < mybytearray.length
            && (numRead3=is.read(mybytearray, offset3, mybytearray.length-offset3)) >= 0  )


    {
        offset3 += numRead3;
      //  is=sk.getInputStream();
    }
    bos.write(mybytearray);
    is.close();

    bos.close();

    //rest is for comparing two binary files.works except for the file I downloaded in this code above.
    try{
        File filename=new File("C:\\Users\\Eda\\workspace\\network1\\tesst5MB" +".bin");
        File filename2=new File("C:\\Users\\Eda\\workspace\\network1\\t1est5MB" +".bin");
        if(filename.exists() && filename2.exists())
            System.out.println("both exists");

        int size = (int)filename.length(); 
        System.out.println("size1: " + size);
        byte[] byteArray1 = new byte[size];
        size = (int)filename2.length(); 

        System.out.println("size2: " + size);
        byte[] byteArray2 = new byte[size];

        DataInputStream infile1 = new DataInputStream(new FileInputStream(filename));


        DataInputStream infile2 = new DataInputStream(new FileInputStream(filename2));


        int offset1 = 0;
        int numRead1 = 0;


        int offset2 = 0;
        int numRead2 = 0;
        System.out.println("1 in length: " +byteArray1.length);

      while(offset1 < byteArray1.length
                && (numRead1=infile1.read(byteArray1, offset1, byteArray1.length-offset1)) >= 0)


        {
            offset1 += numRead1;

        }

        infile1.close();


        while(offset2 < byteArray2.length
                && (numRead2=infile2.read(byteArray2, offset2, byteArray2.length-offset2)) >= 0)


        {
            offset2 += numRead2;

        }

        infile2.close();


        System.out.println(Arrays.equals(byteArray1,byteArray2));
    }
    catch(Exception e){

    }       

}

}

When I copy a file by reading it byte by byte and compare the copy with the original, they are equal. But when I download the file through a Socket and compare it to the same file downloaded with Mozilla, they are not equal, although they should be. I don't know what is wrong with my sk.getInputStream(), and I am stuck here. The file size I hard-coded (5 MB) is the size returned when I used a HEAD request. What am I doing wrong?

Upvotes: 2

Views: 233

Answers (1)

Lou Franco

Reputation: 89192

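When you connect with a raw socket, the response starts with HTTP headers, which you need to parse and keep out of the output file. The real data starts after the first blank line (two line breaks in a row, CRLF CRLF), which marks the end of the headers. Because your buffer is exactly the size of the body, the headers also push the last bytes of the file out of it. You should be reading Content-Length from the headers, not hard-coding the size.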

If you open up your downloaded file (the one you made by using Socket), you'll see those HTTP headers.
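To illustrate the header handling described above, here is a minimal sketch rather than a complete HTTP client. The class name HeaderSkipper and its three methods are hypothetical helpers, not part of the question's code. It assumes the server sends a plain response with a Content-Length header and does not handle chunked transfer encoding or redirects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; the names are assumptions, not an existing API.
class HeaderSkipper {

    private static final byte[] END_OF_HEADERS = {'\r', '\n', '\r', '\n'};

    /** Reads up to and including the blank line that ends the headers; the stream is left at the first body byte. */
    static String readHeaderBlock(InputStream in) throws IOException {
        ByteArrayOutputStream headers = new ByteArrayOutputStream();
        int matched = 0; // number of END_OF_HEADERS bytes matched so far
        int b;
        while (matched < END_OF_HEADERS.length && (b = in.read()) != -1) {
            headers.write(b);
            if (b == END_OF_HEADERS[matched]) {
                matched++;
            } else {
                // A stray '\r' may itself be the start of the terminator.
                matched = (b == '\r') ? 1 : 0;
            }
        }
        if (matched < END_OF_HEADERS.length) {
            throw new IOException("Stream ended before the end of the headers");
        }
        return new String(headers.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Extracts the numeric status from a status line such as "HTTP/1.1 200 OK". */
    static int statusCode(String headerBlock) {
        String statusLine = headerBlock.substring(0, headerBlock.indexOf("\r\n"));
        return Integer.parseInt(statusLine.split(" ")[1]);
    }

    /** Returns the Content-Length value, or -1 if the header is absent. */
    static long contentLength(String headerBlock) {
        for (String line : headerBlock.split("\r\n")) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                return Long.parseLong(line.substring(colon + 1).trim());
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // A canned response stands in for sk.getInputStream() so the sketch runs offline.
        byte[] response = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
                .getBytes(StandardCharsets.ISO_8859_1);
        InputStream in = new ByteArrayInputStream(response);
        String headers = readHeaderBlock(in);
        System.out.println("status = " + statusCode(headers));
        System.out.println("length = " + contentLength(headers));
        System.out.println("first body byte = " + (char) in.read());
    }
}
```

With the real socket, you would wrap sk.getInputStream() in a BufferedInputStream once (reading single bytes from an unbuffered socket stream is slow), call readHeaderBlock on it, check the status code, and then read exactly contentLength bytes from that same stream into the file.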

If this is meant as production code (not just playing around or a one-shot deal), you should not be using Socket for HTTP. HTTP is much more complex than what you have implemented here. At the very least, you should be checking the result code to make sure you got a 200.

Take a look at java.net.URLConnection, which deals with the headers for you and hands you only the body.
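As a rough sketch of that approach, the following downloads a resource through URLConnection and checks for a 200 when the connection is HTTP. The class name UrlDownload and the method download are hypothetical. The URL in main is an assumption pieced together from the host and path in the question, and the output file name is shortened to a relative path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; the names are assumptions.
class UrlDownload {

    /** Copies the body of the resource at url into target and returns the number of bytes written. */
    static long download(URL url, Path target) throws IOException {
        URLConnection conn = url.openConnection();
        if (conn instanceof HttpURLConnection) {
            // For HTTP, refuse anything other than 200 OK.
            int status = ((HttpURLConnection) conn).getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status: " + status);
            }
        }
        // getInputStream() yields only the body; the headers have already been consumed.
        try (InputStream in = conn.getInputStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed URL, built from the host and path used in the question.
        URL url = URI.create("http://wlab.cs.bilkent.edu.tr/PA2/test5MB.bin").toURL();
        Path target = Paths.get("tesst5MB.bin");
        System.out.println("Wrote " + download(url, target) + " bytes");
    }
}
```

Since the headers never reach the output file, the result should be directly comparable with the browser download.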

Upvotes: 3
