ushka1

Reputation: 895

How does parsing the `multipart/form-data` http request work?

Intro

I'm currently working on a simple file server written in Java that uses sockets for communication. During this project I got interested in the format of HTTP requests and would like to replicate it in my project. I'd like to do this with low-level APIs, using only sockets, to get a taste of how it all works under the hood.

TL;DR

The question is pretty straightforward and is located in the last section of the post. Everything else is explanation and my understanding of the problem.

Prerequisites

In the examples below I'll use simplified socket code to show how I understand things. I'm also assuming the following variables exist:

Socket socket = server.accept();
DataInputStream input = new DataInputStream(socket.getInputStream());
DataOutputStream output = new DataOutputStream(socket.getOutputStream());

Parsing Http Request

Okay, so parsing a typical application/x-www-form-urlencoded HTTP request (or similar) seems pretty understandable to me, but please correct me if I'm wrong. Take this example request:

POST / HTTP/1.1
Content-Length: 64
Content-Type: application/x-www-form-urlencoded

name=John%20User&request=Send%20me%20one%20of%20your%20catalogue

A sample server can parse this request like this:

// read start-line of request
String startLine = input.readLine();
...

// read all headers till you encounter empty line
String header;
while (!(header = input.readLine()).equals("")) {
  ...
}

// read body
int len = <Content-Length header value>;
byte[] body = new byte[len];
// readFully blocks until all len bytes have arrived; a plain read() may return fewer
input.readFully(body, 0, len);
...
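The steps above can be put together into a small runnable sketch. This is only one possible shape, not a complete HTTP parser: the class name `ParsedRequest` and its helpers are made up for illustration, header names are lower-cased for lookup, and lines are read byte by byte instead of through the deprecated `DataInputStream.readLine()`.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical holder for a parsed request; not part of any library.
class ParsedRequest {
    String startLine;
    final Map<String, String> headers = new LinkedHashMap<>();
    byte[] body = new byte[0];

    // Reads bytes up to the next '\n' (or end of stream), dropping '\r',
    // and maps each byte to one char (same effect as ISO-8859-1 decoding).
    static String readLine(DataInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') sb.append((char) b);
        }
        return sb.toString();
    }

    static ParsedRequest read(DataInputStream in) throws IOException {
        ParsedRequest req = new ParsedRequest();
        req.startLine = readLine(in);
        String line;
        // Headers end at the first empty line.
        while (!(line = readLine(in)).isEmpty()) {
            int colon = line.indexOf(':');
            if (colon < 0) continue; // this sketch simply skips malformed lines
            req.headers.put(line.substring(0, colon).trim().toLowerCase(Locale.ROOT),
                            line.substring(colon + 1).trim());
        }
        String len = req.headers.get("content-length");
        if (len != null) {
            req.body = new byte[Integer.parseInt(len)];
            in.readFully(req.body); // blocks until the whole body is read
        }
        return req;
    }

    // Convenience wrapper for trying the parser without a socket.
    static ParsedRequest parse(byte[] raw) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(raw)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String payload = "name=John%20User&request=Send%20me%20one%20of%20your%20catalogue";
        String raw = "POST / HTTP/1.1\r\n"
                + "Content-Length: " + payload.length() + "\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "\r\n"
                + payload;
        ParsedRequest req = parse(raw.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(req.startLine);
        System.out.println(req.headers);
        System.out.println(new String(req.body, StandardCharsets.ISO_8859_1));
    }
}
```

With a real connection, `read` would be called with the `input` stream from the prerequisites instead of the in-memory wrapper.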

Parsing multipart/form-data http request

And here is my main question. Take an example multipart request:

POST / HTTP/1.1
Content-Type: multipart/form-data; boundary=boundary
Content-Length: 465

--boundary
Content-Disposition: form-data; name="name"

John
--boundary
Content-Disposition: form-data; name="avatar"; filename="avatar.jpg"
Content-Type: image/jpeg

<some binary data>
--boundary--

I'm not sure what parsing such a request should look like. The start-line and headers can be parsed the same way as in the earlier example, but how should the body be handled, especially when it contains binary data? I had some ideas, but I consider them wrong or insufficient.

My Attempt

My attempt was to read the body as a string. The body could then be divided into parts using the boundary value, and the server could work on those separate parts (e.g. extract headers, do something with the value, and so on). It could look like this:

int len = <Content-Length header value>;
byte[] byteBody = new byte[len];
input.readFully(byteBody, 0, len);

String boundary = <extracted from header>;
String body = new String(byteBody);
String[] bodyParts = body.split(boundary);
...
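For the `<extracted from header>` placeholder, a minimal sketch of pulling the boundary out of the Content-Type value might look like the following. The class and method names are hypothetical, and the sketch assumes the header value has already been read as a string; it handles an optionally quoted boundary and nothing more.

```java
// Hypothetical helper; not part of any library.
class BoundaryParam {
    // Returns the boundary parameter of a Content-Type value, or null if absent.
    static String extractBoundary(String contentType) {
        for (String param : contentType.split(";")) {
            String p = param.trim();
            // Parameter names are matched case-insensitively.
            if (p.regionMatches(true, 0, "boundary=", 0, 9)) {
                String value = p.substring(9).trim();
                // Strip surrounding double quotes if the value is quoted.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extractBoundary("multipart/form-data; boundary=boundary"));
    }
}
```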

And then I ran into a problem: it won't work for binary data. Converting byte[] to String and back to byte[] (to write the file on the server) cannot work for files. That's because new String(bytes) decodes with the default charset (UTF-8, judging by the output below), and byte sequences that are not valid in that charset are replaced with the replacement character U+FFFD. I did a small test; here are the results:

byte[] arr1 = new byte[] { -1, -2, -3 };
String str1 = new String(arr1);
byte[] arr2 = str1.getBytes();

// arr1 = [-1, -2, -3]
// arr2 = [-17, -65, -67, -17, -65, -67, -17, -65, -67]
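To illustrate why the charset matters, here is a small sketch comparing the same round trip through UTF-8 and through ISO-8859-1. ISO-8859-1 maps every byte value to exactly one character, so the round trip is lossless there, though keeping the data as bytes avoids the question entirely. The class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of any library.
class CharsetRoundTrip {
    // Invalid UTF-8 sequences are replaced with U+FFFD, so data is lost.
    static byte[] viaUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    // Every byte value 0..255 maps to one char and back, so nothing is lost.
    static byte[] viaLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] arr1 = { -1, -2, -3 };
        System.out.println(Arrays.toString(viaUtf8(arr1)));   // [-17, -65, -67, -17, -65, -67, -17, -65, -67]
        System.out.println(Arrays.toString(viaLatin1(arr1))); // [-1, -2, -3]
    }
}
```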

After learning that, I searched for a solution. I think base64 encoding could solve my problem, but it looks like a workaround to me and has its drawbacks.

I also found many examples, and wrote a simple Node.js server to confirm it, showing that files in a multipart/form-data request body can definitely be sent in binary format rather than base64.

Confusion

I'm a little confused now. I don't know how to parse a multipart/form-data request body without converting it to a string while still being able to break it into separate parts using the boundary value. I thought about reading the body byte by byte and somehow detecting boundaries, but this seems neither a good nor an efficient approach to me.

I'm really curious what the correct way to accomplish this task is, and what the standard approach to parsing this type of request body is.

Upvotes: 2

Views: 7166

Answers (1)

Rob Spoor

Reputation: 9135

The body format is like this:

  1. -- followed by the boundary noted in the Content-Type header
  2. Any number of body headers, similar to the headers of the root HTTP request
  3. An empty line (\r\n), similar to that between the headers of the root HTTP request and the root body
  4. The part body
  5. Repeat 1-4 for other parts
  6. -- followed by the boundary followed by --

Instead of using split, I'd parse the body line by line. When you encounter a boundary line, finish the previous part (there is nothing to finish before the first one). When you encounter the closing boundary, you're done.
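As a rough illustration of the format described above, the sketch below works directly on the raw `byte[]` body, so binary part content is never converted to a string. It is a variation on the line-by-line idea: rather than reading lines, it searches the bytes for the delimiter `\r\n--` plus the boundary (the line break before a boundary line belongs to the delimiter, not to the part). All names are hypothetical, the whole body is assumed to be in memory, and each part is assumed to have at least one header.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of any library.
class MultipartSplitter {
    static final byte[] CRLF = { '\r', '\n' };

    static byte[] ascii(String s) { return s.getBytes(StandardCharsets.ISO_8859_1); }

    static byte[] concat(byte[]... chunks) {
        int total = 0;
        for (byte[] c : chunks) total += c.length;
        byte[] out = new byte[total];
        int pos = 0;
        for (byte[] c : chunks) {
            System.arraycopy(c, 0, out, pos, c.length);
            pos += c.length;
        }
        return out;
    }

    // Index of the first occurrence of needle at or after from, or -1.
    static int indexOf(byte[] haystack, byte[] needle, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    // Returns each part as raw bytes: part headers, empty line, part body.
    static List<byte[]> split(byte[] body, String boundary) {
        byte[] first = ascii("--" + boundary);
        byte[] delim = ascii("\r\n--" + boundary);
        List<byte[]> parts = new ArrayList<>();
        int pos = indexOf(body, first, 0);
        if (pos < 0) return parts;
        pos += first.length;
        while (true) {
            // "--" right after the boundary marks the end of the body.
            if (pos + 1 < body.length && body[pos] == '-' && body[pos + 1] == '-') break;
            int start = indexOf(body, CRLF, pos); // end of the boundary line
            if (start < 0) break;
            start += CRLF.length;
            int end = indexOf(body, delim, start);
            if (end < 0) break; // truncated body
            parts.add(Arrays.copyOfRange(body, start, end));
            pos = end + delim.length;
        }
        return parts;
    }

    // Part headers as text (everything before the first empty line).
    static String headersOf(byte[] part) {
        int sep = indexOf(part, ascii("\r\n\r\n"), 0);
        return new String(part, 0, Math.max(sep, 0), StandardCharsets.ISO_8859_1);
    }

    // Part body as untouched bytes (everything after the first empty line).
    static byte[] contentOf(byte[] part) {
        int sep = indexOf(part, ascii("\r\n\r\n"), 0);
        return sep < 0 ? part : Arrays.copyOfRange(part, sep + 4, part.length);
    }

    public static void main(String[] args) {
        byte[] body = concat(
            ascii("--boundary\r\nContent-Disposition: form-data; name=\"name\"\r\n\r\nJohn\r\n"),
            ascii("--boundary\r\nContent-Disposition: form-data; name=\"avatar\"; filename=\"avatar.jpg\"\r\n"
                + "Content-Type: image/jpeg\r\n\r\n"),
            new byte[] { -1, -2, -3 }, // stand-in for the binary file data
            ascii("\r\n--boundary--\r\n"));
        for (byte[] part : split(body, "boundary")) {
            System.out.println(headersOf(part));
            System.out.println(Arrays.toString(contentOf(part)));
        }
    }
}
```

For large uploads a real server would more likely scan the stream incrementally instead of buffering the whole body, but the delimiter logic stays the same.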

Upvotes: 3
