Artyom

Reputation: 31273

How to distinguish between multipart form data of an empty file and a missing file?

On the server side, I am trying to distinguish between the upload of an empty file and no file being uploaded at all.

The POST body content of an empty file is:

------WebKitFormBoundaryAYxCGhPMYcmdkdlv
Content-Disposition: form-data; name="_1"

dd
------WebKitFormBoundaryAYxCGhPMYcmdkdlv
Content-Disposition: form-data; name="_2"; filename="foo"
Content-Type: application/octet-stream


------WebKitFormBoundaryAYxCGhPMYcmdkdlv
Content-Disposition: form-data; name="_3"

Upload
------WebKitFormBoundaryAYxCGhPMYcmdkdlv--

While the body for a missing file is:

------WebKitFormBoundaryMldAHhbBqWpKPlRY
Content-Disposition: form-data; name="_1"

dd
------WebKitFormBoundaryMldAHhbBqWpKPlRY
Content-Disposition: form-data; name="_2"; filename=""
Content-Type: application/octet-stream


------WebKitFormBoundaryMldAHhbBqWpKPlRY
Content-Disposition: form-data; name="_3"

Upload
------WebKitFormBoundaryMldAHhbBqWpKPlRY--

The only difference is the filename= parameter: in one case it is empty, in the other it contains the file name (Firefox and Chromium behave the same).

Questions:

  1. Are there any conditions under which a browser wouldn't provide a filename (for security reasons or similar)?
  2. Is this actually a valid/standard way to distinguish between an empty file and a file that was not set? Please provide a reference.
  3. Is Content-Type: application/octet-stream the standard value, and will it always be set when no file is uploaded?

I'd like to see references to standards that confirm or disprove my observations.

Upvotes: 2

Views: 786

Answers (1)

Pete Barnett

Reputation: 331

1. Filename

The original specification for "Form-based File Upload in HTML" was RFC1867

According to that specification, the filename parameter

"is not required, but is strongly recommended in any case where the original filename is known"

This spec was superseded by RFC2388 - "Returning Values from Forms: multipart/form-data", which states

The sending application MAY supply a file name ... as specified in RFC2184;

(where RFC2184 goes on to stress its importance, without requiring it)

Note that it refers to "the sending application": the spec is deliberately application agnostic. For a cross-browser view of actual implementations, however, Mozilla's MDN documentation for FormData sheds some light. In the context of FormData.append(), where a file/blob is set but the filename is not set explicitly:

The default filename for Blob objects is "blob". The default filename for File objects is the file's filename.

2. Difference between empty & no file

To answer this, it's important to note Section 5.7 of RFC2388 - "Correlating form data with the original form"

This specification provides no specific mechanism by which multipart/form-data can be associated with the form that caused it to be transmitted. This separation is intentional...

The question is answered in the HTML5 specification, however, which details how the form data set is constructed.

...if the field element is an <input> element whose type attribute is in the File Upload state, then for each file selected in the <input> element, append an entry to the form data set with the name as the name, the file (consisting of the name, the type, and the body) as the value, and type as the type. If there are no selected files, then append an entry to the form data set with the name as the name, the empty string as the value, and application/octet-stream as the type.

This matches your observation above.
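As a rough illustration of the rule above, here is a minimal Python sketch of a server-side check. It assumes the browser follows the HTML5 behaviour quoted above (filename="" and an empty body when no file is selected) and it uses the standard library email parser purely for demonstration. The names classify_file_fields and sample_body are hypothetical, and a part with no filename parameter at all is treated as an ordinary field.

```python
from email.parser import BytesParser


def classify_file_fields(content_type: str, body: bytes) -> dict:
    """Map each file field name to 'missing', 'empty' or 'non-empty'."""
    # The email parser expects headers first, so prepend the request's
    # Content-Type header (it carries the boundary) to the raw body.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = BytesParser().parsebytes(raw)
    if not message.is_multipart():
        raise ValueError("body is not a multipart message")

    result = {}
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        filename = part.get_param("filename", header="content-disposition")
        if filename is None:
            continue  # no filename parameter: ordinary (non-file) field
        data = part.get_payload(decode=True) or b""
        if filename == "":
            result[name] = "missing"    # no file was selected in the form
        elif not data:
            result[name] = "empty"      # a file was selected, zero bytes long
        else:
            result[name] = "non-empty"
    return result


def sample_body(boundary: str, filename: str) -> bytes:
    """Rebuild the file part shown in the question for a given filename."""
    lines = [
        "--" + boundary,
        'Content-Disposition: form-data; name="_2"; filename="%s"' % filename,
        "Content-Type: application/octet-stream",
        "",
        "",
        "--" + boundary + "--",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


BOUNDARY = "----WebKitFormBoundaryAYxCGhPMYcmdkdlv"
CONTENT_TYPE = "multipart/form-data; boundary=" + BOUNDARY
print(classify_file_fields(CONTENT_TYPE, sample_body(BOUNDARY, "foo")))
print(classify_file_fields(CONTENT_TYPE, sample_body(BOUNDARY, "")))
```

The email parser is not designed for large binary uploads, so a real server would more likely rely on its framework's multipart handling; the point here is only that the decision rests on the filename parameter, not on the body or the Content-Type.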

Looking to how a real-world server implementation deals with this, take the PHP runtime as an example. Its API makes no distinction between "no file" and "empty file" - and will raise a single error UPLOAD_ERR_NO_FILE in either case.

As PHP is open source (written in C), you can see that implementation here

3. MIME content-type encoding

This is answered in point 2 above: as detailed in the HTML5 spec, a compliant browser will always send application/octet-stream when no file is selected.

For completeness, if a file is provided, RFC2388 specifies that:

If the contents of a file are returned via filling out a form, then the file input is identified as the appropriate media type, if known, or "application/octet-stream"
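The fallback described in that passage can be sketched from the sending side as follows. This is only an illustration: choose_part_content_type is a hypothetical helper, and it guesses the type from the file name with Python's mimetypes module, whereas browsers use their own mapping.

```python
import mimetypes


def choose_part_content_type(filename: str) -> str:
    """Return the guessed media type, or the generic fallback if unknown."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


print(choose_part_content_type("notes.txt"))
print(choose_part_content_type("foo"))
```

A file name with no recognisable extension, such as the "foo" from the question, ends up with application/octet-stream, which is why that type alone cannot tell an empty upload from a missing one.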

Upvotes: 3
