Ori Popowski

Reputation: 10672

Why does my HTTP message end before the body reaches the size stated in the Content-Length header?

I wrote a tokenizer for HTTP messages in Java. It has a method nextToken() which is supposed to return a string containing the whole HTTP message that was received. The problem is that the message ends before the expected body size has been read.

I read the input stream all the way to the beginning of the body. Then I try to read n bytes from the stream, where n is the size of the body in bytes as stated in the Content-Length header. Inside the while loop, the line charsRead = in.read(buffer) blocks because there is no more input in the input stream, but this happens before n bytes have been read.

Example: with a body of 12,493 bytes, it blocks when 675 bytes are still expected.

The input stream is decoded as UTF-8, so I assume every byte is decoded to one char.

/* Somewhere else in the code: 
InputStreamReader _isr =
     new InputStreamReader(clientSocket.getInputStream(), "UTF-8")
*/
BufferedReader in = new BufferedReader(_isr);
StringBuilder tmp = new StringBuilder();
String line = "";
boolean body = false;
int bodylen = -1;

for (;;) {
   line = in.readLine();

   if (line == null)
       break;
   if (line.equals("")) { /* We've reached the body */
       body = true;
       break;
   }

   tmp.append(line + "\r\n");

   if ((bodylen == -1) && (line.contains("Content-Length:"))) {
       /* Make `bodylen` hold the length of the body */
       String[] splitted = line.split("Content-Length:");
       bodylen = Integer.parseInt(splitted[1].trim());
   }
}

if (body == true) { 
    int charsRead;
    char[] buffer = new char[1024];

    while (bodylen > 0) {
        charsRead = in.read(buffer);
        if (charsRead == -1)
            break;
        bodylen -= charsRead;
        tmp.append(buffer);
    }
}

Why does this happen, and how can I solve it?

Upvotes: 1

Views: 1051

Answers (2)

user931366

Reputation: 694

You're using the wrong read() method. You should be using the read(byte[], int start, int len) method.

Here's a sample helper showing how you should be reading:

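// Needs: import java.io.IOException; import java.io.InputStream;
private void readAll(InputStream is, byte[] buffer) throws IOException {
    int read = 0; // number of bytes stored in the buffer so far
    while (read != buffer.length) {
        // Ask only for the bytes still missing, and store them after the ones already read
        int ret = is.read(buffer, read, buffer.length - read);
        if (ret == -1) return; // end of stream before the buffer was filled
        read += ret;
    }
}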

What your code does is ask the API for up to 1024 bytes every time you call read. What happened is that the underlying InputStream could only deliver 675 bytes (it's a network call, so this is to be expected). On the next iteration through the loop, you ask the API for 1024 bytes again. The API reads whatever remains and then blocks waiting for data to fill the rest of the buffer, which never arrives, because you've split your read over two calls. Your code also overwrites the previous read, since both calls start at offset 0.

This is pretty normal behavior when dealing with network I/O. Folks get so used to dealing with files that they find it odd when a single read does not fill the whole buffer.
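
To make the partial-read behavior concrete, here is a minimal sketch that simulates it without a socket. The class name PartialReadDemo and the helpers trickle and readFully are made up for illustration; trickle wraps a ByteArrayInputStream so that each call returns at most a fixed number of bytes, which is only a rough stand-in for how a network stream may behave.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PartialReadDemo {
    /** Hypothetical stream that hands out at most `chunk` bytes per read call, like a slow socket. */
    static InputStream trickle(byte[] data, int chunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                // Cap every request at `chunk` bytes to imitate a short read
                return super.read(b, off, Math.min(len, chunk));
            }
        };
    }

    /** Same loop as readAll above, but returns how many bytes were actually read. */
    static int readFully(InputStream is, byte[] buffer) throws IOException {
        int read = 0;
        while (read < buffer.length) {
            int ret = is.read(buffer, read, buffer.length - read);
            if (ret == -1) break; // stream ended early
            read += ret;
        }
        return read;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1024];
        byte[] buffer = new byte[1024];
        // A single call may return fewer bytes than requested
        System.out.println(trickle(data, 675).read(buffer, 0, buffer.length)); // 675
        // Looping with an advancing offset collects all of them
        System.out.println(readFully(trickle(data, 675), buffer)); // 1024
    }
}
```

The point of the sketch is only that the return value of read has to be used: it tells you how many bytes arrived, and the next call must continue from that offset.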

Upvotes: 2

Julian Reschke

Reputation: 42065

It seems you are confusing characters with bytes. Content-Length is in bytes, but you are counting characters.
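
One way to act on this is to stay at the byte level until the whole body has arrived, and only then decode it. The following is a rough sketch, not a complete HTTP parser: the class HttpBodyReader and its methods are hypothetical names, it assumes the body is UTF-8 text (as in the question), it decodes header lines as ISO-8859-1, and it ignores things like chunked transfer encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class HttpBodyReader {
    /** Reads one line as raw bytes up to LF, dropping CR; returns null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
            }
            if (b != '\r') line.write(b);
        }
        if (line.size() == 0) return null;
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Returns headers, blank line and body as one string; the body is counted in bytes. */
    static String readMessage(InputStream in) throws IOException {
        StringBuilder message = new StringBuilder();
        int bodyLength = 0;
        boolean sawBlankLine = false;
        String line;
        while ((line = readLine(in)) != null) {
            if (line.isEmpty()) { // end of the header section
                sawBlankLine = true;
                break;
            }
            message.append(line).append("\r\n");
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                bodyLength = Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        if (!sawBlankLine) return message.toString();
        message.append("\r\n");

        // Read exactly bodyLength BYTES, tolerating short reads
        byte[] body = new byte[bodyLength];
        int read = 0;
        while (read < bodyLength) {
            int n = in.read(body, read, bodyLength - read);
            if (n == -1) break; // peer closed the connection early
            read += n;
        }
        // Decode only after all bytes are in, so multi-byte characters are never split
        message.append(new String(body, 0, read, StandardCharsets.UTF_8));
        return message.toString();
    }
}
```

With a socket you would probably pass something like new BufferedInputStream(clientSocket.getInputStream()), since readLine here reads one byte at a time. For a body such as "café €5", Content-Length is 10 even though the string has only 7 chars, which is exactly the mismatch that leaves a char-counting loop waiting for input that never comes.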

Upvotes: 3
