Doug

Reputation: 35136

How can you upload files as a stream in Go?

There are a number of tutorials about posting files using http.Request in Go, but almost invariably they start like this:

file, err := os.Open(path)
if err != nil {
    return nil, err
}
fileContents, err := ioutil.ReadAll(file)

Which is to say, you read the entire file into memory, and then convert it into a Buffer and pass that into a request, something like this:

func send(client *http.Client, file *os.File, endpoint string) {
    body := &bytes.Buffer{}
    io.Copy(body, file)
    req, _ := http.NewRequest("POST", endpoint, body)
    resp, _ := client.Do(req)
}

If you wanted to post a massive file without reading it into memory, and instead stream the file up in chunks, how would you do that?

Upvotes: 14

Views: 20246

Answers (3)

abdusco

Reputation: 11081

If the request must have a Content-Length header (most file hosts reject upload requests without it) and you want to upload the file as a stream (without loading it all into memory), the standard library won't help you: you have to calculate the length yourself.

Here's a minimal working example (with minimal error checking) that uses io.MultiReader to chain an os.File with the other fields while keeping track of the request size.

It supports regular fields (with string content) and file fields, and calculates the total request body size. It's easy to extend it with other value types by simply adding a new case branch.

import (
    "crypto/rand"
    "fmt"
    "io"
    "io/fs"
    "mime"
    "path/filepath"
    "strings"
)

type multipartPayload struct {
    headers map[string]string
    body    io.Reader
    size    int64
}

func randomBoundary() string {
    var buf [8]byte
    _, err := io.ReadFull(rand.Reader, buf[:])
    if err != nil {
        panic(err)
    }
    return fmt.Sprintf("%x", buf[:])
}

// Multipart request has the following structure:
//  POST /upload HTTP/1.1
//  Other-Headers: ...
//  Content-Type: multipart/form-data; boundary=$boundary
//  \r\n
//  --$boundary\r\n    👈 request body starts here 
//  Content-Disposition: form-data; name="field1"\r\n
//  Content-Type: text/plain; charset=utf-8\r\n
//  Content-Length: 4\r\n
//  \r\n
//  $content\r\n
//  --$boundary\r\n
//  Content-Disposition: form-data; name="field2"\r\n
//  ...
//  --$boundary--\r\n
func prepareMultipartPayload(fields map[string]interface{}) (*multipartPayload, error) {
    boundary := randomBoundary()
    headers := make(map[string]string)
    headers["Content-Type"] = fmt.Sprintf("multipart/form-data; boundary=%s", boundary)

    // int64 so that large files do not overflow int on 32-bit platforms
    var totalSize int64
    parts := make([]io.Reader, 0)
    const CRLF = "\r\n"

    fieldBoundary := "--" + boundary + CRLF

    for k, v := range fields {
        // skip nil values before emitting a boundary, so no empty part is written
        if v == nil {
            continue
        }
        switch val := v.(type) {
        case string:
            header := fmt.Sprintf(`Content-Disposition: form-data; name="%s"`, k)
            parts = append(
                parts,
                strings.NewReader(fieldBoundary),
                strings.NewReader(header+CRLF+CRLF),
                strings.NewReader(val),
                strings.NewReader(CRLF),
            )
            totalSize += int64(len(fieldBoundary) + len(header) + 2*len(CRLF) + len(val) + len(CRLF))
        case fs.File:
            stat, err := val.Stat()
            if err != nil {
                return nil, err
            }
            contentType := mime.TypeByExtension(filepath.Ext(stat.Name()))
            if contentType == "" {
                // unknown extension: fall back to a generic binary type
                contentType = "application/octet-stream"
            }
            header := strings.Join([]string{
                fmt.Sprintf(`Content-Disposition: form-data; name="%s"; filename="%s"`, k, stat.Name()),
                fmt.Sprintf(`Content-Type: %s`, contentType),
                fmt.Sprintf(`Content-Length: %d`, stat.Size()),
            }, CRLF)
            parts = append(
                parts,
                strings.NewReader(fieldBoundary),
                strings.NewReader(header+CRLF+CRLF),
                val, // the file is read lazily when the request body is sent
                strings.NewReader(CRLF),
            )
            totalSize += int64(len(fieldBoundary)+len(header)+2*len(CRLF)) + stat.Size() + int64(len(CRLF))
        }
    }
    finishBoundary := "--" + boundary + "--" + CRLF
    parts = append(parts, strings.NewReader(finishBoundary))
    totalSize += int64(len(finishBoundary))

    headers["Content-Length"] = fmt.Sprintf("%d", totalSize)

    return &multipartPayload{headers, io.MultiReader(parts...), totalSize}, nil
}

Then prepare the request, set the content length, and send it:

file, err := os.Open("/path/to/file.ext")
if err != nil {
    return nil, err
}
defer file.Close()

up, err := prepareMultipartPayload(map[string]interface{}{
    "a_string":      "field",
    "another_field": "yep",
    "file":          file,  // you can have multiple file fields
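})
if err != nil {
    return nil, err
}
r, err := http.NewRequest("POST", "https://example.com/upload", up.body)
if err != nil {
    return nil, err
}
for k, v := range up.headers {
    r.Header.Set(k, v)
}
// the body is a plain io.Reader, so the length has to be set explicitly
r.ContentLength = up.size
c := http.Client{}
res, err := c.Do(r)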

Upvotes: 0

Doug

Reputation: 35136

Turns out you can actually pass the *File (or any stream-like) object straight into NewRequest.

Note the caveat, however: NewRequest (as shown here: https://golang.org/src/net/http/request.go?s=21674:21746#L695) won't set ContentLength unless the body is explicitly one of:

  • *bytes.Buffer
  • *bytes.Reader
  • *strings.Reader

Since *File isn't one of these, the request will be sent without a content length unless you set it manually. Some servers then discard the body of the incoming request, so the server sees a body of '' even though it appears to have been sent correctly from the Go side.

Upvotes: 2

putu

Reputation: 6444

If you need to set Content-Length, you can do it manually. The following snippet is an example of uploading a file and extra parameters as a stream (the code is based on "Buffer-less Multipart POST in Golang"):

import (
    "bytes"
    "io"
    "log"
    "mime/multipart"
    "net/http"
    "os"
)

//NOTE: for simplicity, error check is omitted
func uploadLargeFile(uri, filePath string, chunkSize int, params map[string]string) {
    //open file and retrieve info
    file, _ := os.Open(filePath)
    fi, _ := file.Stat()
    defer file.Close()    

    //buffer for storing multipart data
    byteBuf := &bytes.Buffer{}

    //part: parameters
    mpWriter := multipart.NewWriter(byteBuf)
    for key, value := range params {
        _ = mpWriter.WriteField(key, value)
    }

    //part: file
    mpWriter.CreateFormFile("file", fi.Name())
    contentType := mpWriter.FormDataContentType()

    nmulti := byteBuf.Len()
    multi := make([]byte, nmulti)
    _, _ = byteBuf.Read(multi)    

    //part: latest boundary
    //when multipart closed, latest boundary is added
    mpWriter.Close()
    nboundary := byteBuf.Len()
    lastBoundary := make([]byte, nboundary)
    _, _ = byteBuf.Read(lastBoundary)

    //calculate content length
    totalSize := int64(nmulti) + fi.Size() + int64(nboundary)
    log.Printf("Content length = %v byte(s)\n", totalSize)

    //use pipe to pass request
    rd, wr := io.Pipe()
    defer rd.Close()

    go func() {
        defer wr.Close()

        //write multipart
        _, _ = wr.Write(multi)

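        //write file in chunks of chunkSize bytes
        buf := make([]byte, chunkSize)
        for {
            n, err := file.Read(buf)
            if n > 0 {
                //stop early if the reader side has gone away
                if _, werr := wr.Write(buf[:n]); werr != nil {
                    return
                }
            }
            if err == io.EOF {
                break
            }
            if err != nil {
                //surface the read failure to the request instead of sending a short body
                _ = wr.CloseWithError(err)
                return
            }
        }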
        //write boundary
        _, _ = wr.Write(lastBoundary)        
    }()

    //construct request with rd
    req, _ := http.NewRequest("POST", uri, rd)
    req.Header.Set("Content-Type", contentType)
    req.ContentLength = totalSize

    //process request
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        log.Fatal(err)
    } else {
        log.Println(resp.StatusCode)
        log.Println(resp.Header)

        body := &bytes.Buffer{}
        _, _ = body.ReadFrom(resp.Body)
        resp.Body.Close()
        log.Println(body)
    }
}

Upvotes: 13
