hoju

Reputation: 29472

Python standard library to POST multipart/form-data encoded data

I would like to POST multipart/form-data encoded data. I have found an external module that does it (http://atlee.ca/software/poster/index.html); however, I would rather avoid this dependency. Is there a way to do this using only the standard library?

thanks

Upvotes: 22

Views: 17077

Answers (4)

Jaymon

Reputation: 5738

You can use the email module from Python's standard library. I'm not sure when the email module gained this functionality, but I tested this on Python 3.10.14.

import email.encoders  # used below for encode_base64
import email.parser
import email.mime.multipart
import email.mime.text
import email.mime.base
import mimetypes
import os

def encode_multipart(fields, files, charset=None):
    multipart_data = email.mime.multipart.MIMEMultipart("form-data")

    # Add form fields
    for key, value in fields.items():
        part = email.mime.text.MIMEText(str(value), "plain", _charset=charset)
        part.add_header("Content-Disposition", f"form-data; name=\"{key}\"")
        multipart_data.attach(part)

    # Add files
    for key, fp in files.items():
        # Fall back to a generic type when the extension is not recognized
        mimetype = mimetypes.guess_type(fp.name)[0] or "application/octet-stream"
        maintype, subtype = mimetype.split("/", maxsplit=1)
        basename = os.path.basename(fp.name)
        part = email.mime.base.MIMEBase(maintype, subtype)
        part.set_payload(fp.read())

        part.add_header(
            "Content-Disposition",
            f"form-data; name=\"{key}\";filename=\"{basename}\""
        )
        email.encoders.encode_base64(part)
        multipart_data.attach(part)

    headerbytes, body = multipart_data.as_bytes().split(b"\n\n", 1)
    hp = email.parser.BytesParser().parsebytes(headerbytes, headersonly=True)

    # items() returns the same (name, value) pairs without using a private attribute
    return hp.items(), body

encode_multipart returns the request headers and the multipart/form-data request body that the client can send to the server. You would use it like this:

with open("<SOME-FILEPATH>", "rb") as fp:  # binary mode so the file bytes are read unchanged
    fields = {
        "foo": 1,
        "bar": "two"
    }
    files = {
        "file-key": fp
    }

    request_headers, request_body = encode_multipart(fields, files)
    print(request_headers)
    print(request_body)

I'm not verifying the data or handling any errors, but this should be enough to get people started.
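
To actually send the result, the headers and body can be wrapped in a urllib.request.Request. This is only a rough sketch: build_request is a hypothetical helper name, the URL is a placeholder, and the example headers and body are stand-ins shaped like encode_multipart's return value rather than real output. Whitespace in header values is collapsed in case the email package folded a long header across lines.

```python
import urllib.request

def build_request(url, headers, body):
    """Wrap (name, value) header pairs and a bytes body in a POST request.

    Nothing is sent here; pass the result to urllib.request.urlopen to send it.
    """
    request = urllib.request.Request(url, data=body, method="POST")
    for name, value in headers:
        # Collapse folded whitespace so each header value is a single line
        request.add_header(name, " ".join(value.split()))
    return request

# Stand-in values (assumed for illustration, not real encode_multipart output)
example_headers = [("Content-Type", 'multipart/form-data;\n boundary="abc"')]
example_body = b"--abc--\n"

request = build_request("http://localhost:8000/upload", example_headers, example_body)
# urllib.request.urlopen(request) would perform the upload
```

Note that Request stores header names capitalized, so the content type is looked up as "Content-type".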

Upvotes: 0

Martin v. Löwis

Reputation: 127527

The standard library does not currently support that. There is a cookbook recipe, though, that includes a fairly short piece of code you may just want to copy, along with long discussions of alternatives.

Upvotes: 17

ticapix

Reputation: 1762

It's an old thread but still a popular one, so here is my contribution using only standard modules.

The idea is the same as here, but this version supports both Python 2.x and Python 3.x. It also has a body generator to avoid unnecessary memory usage.

import codecs
import mimetypes
import sys
import uuid
try:
    import io
except ImportError:
    pass  # io is required on Python 3; only Python 2 releases before 2.6 lack it

class MultipartFormdataEncoder(object):
    def __init__(self):
        self.boundary = uuid.uuid4().hex
        self.content_type = 'multipart/form-data; boundary={}'.format(self.boundary)

    @classmethod
    def u(cls, s):
        if sys.hexversion < 0x03000000 and isinstance(s, str):
            s = s.decode('utf-8')
        if sys.hexversion >= 0x03000000 and isinstance(s, bytes):
            s = s.decode('utf-8')
        return s

    def iter(self, fields, files):
        """
        fields is a sequence of (name, value) elements for regular form fields.
        files is a sequence of (name, filename, file object) elements for data to be uploaded as files.
        Yields the body as (chunk, length) tuples, where each chunk is bytes.
        """
        encoder = codecs.getencoder('utf-8')
        for (key, value) in fields:
            key = self.u(key)
            yield encoder('--{}\r\n'.format(self.boundary))
            yield encoder(self.u('Content-Disposition: form-data; name="{}"\r\n').format(key))
            yield encoder('\r\n')
            if isinstance(value, int) or isinstance(value, float):
                value = str(value)
            yield encoder(self.u(value))
            yield encoder('\r\n')
        for (key, filename, fd) in files:
            key = self.u(key)
            filename = self.u(filename)
            yield encoder('--{}\r\n'.format(self.boundary))
            yield encoder(self.u('Content-Disposition: form-data; name="{}"; filename="{}"\r\n').format(key, filename))
            yield encoder('Content-Type: {}\r\n'.format(mimetypes.guess_type(filename)[0] or 'application/octet-stream'))
            yield encoder('\r\n')
            with fd:
                buff = fd.read()
                yield (buff, len(buff))
            yield encoder('\r\n')
        yield encoder('--{}--\r\n'.format(self.boundary))

    def encode(self, fields, files):
        body = io.BytesIO()
        for chunk, chunk_len in self.iter(fields, files):
            body.write(chunk)
        return self.content_type, body.getvalue()
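
If the goal is to stream rather than buffer, the (chunk, length) pairs from iter can be handed to any write callable one at a time. Below is a minimal sketch under some assumptions: stream_body is a hypothetical helper name, and the chunks are stand-ins shaped like the generator's output. One detail worth knowing is that the second item returned by a codecs encoder counts the input characters consumed, not the bytes produced, so the sketch measures the encoded data itself.

```python
import codecs
import io

def stream_body(chunks, write):
    """Pass each chunk to write() as it is produced; return total bytes written."""
    total = 0
    for data, _consumed in chunks:
        write(data)
        # _consumed is a character count for encoder output, so use len(data)
        total += len(data)
    return total

# Stand-in chunks shaped like the generator's output (assumed for illustration)
encode = codecs.getencoder("utf-8")
chunks = [encode("--boundary\r\n"), encode("é\r\n"), (b"abc", 3)]

buffer = io.BytesIO()  # a connected socket's sendall could replace buffer.write
total = stream_body(chunks, buffer.write)
```

The returned total is what a Content-Length header would need, although it is only known after the whole body has been produced.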

Demo

# some utf8 key/value pairs
fields = [('প্রায়', 42), ('bar', b'23'), ('foo', 'ން:')]
files = [('myfile', 'image.jpg', open('image.jpg', 'rb'))]

# encode the whole body in memory; loop over iter() instead to write chunks to a socket
content_type, body = MultipartFormdataEncoder().encode(fields, files)
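
One way to sanity-check an encoded body is to parse it back with the standard email package. The following is a sketch only: parse_multipart is a hypothetical helper name, and the sample body is hand-written for illustration rather than produced by the encoder above.

```python
import email

def parse_multipart(content_type, body):
    """Map each field name to (filename or None, payload bytes)."""
    # Prepend the Content-Type header so the email parser can find the boundary
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = email.message_from_bytes(raw)
    if not message.is_multipart():
        raise ValueError("body could not be parsed as multipart data")
    parsed = {}
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        parsed[name] = (part.get_filename(), part.get_payload(decode=True))
    return parsed

# Hand-written sample body (assumed for illustration)
sample_body = (
    b'--XYZ\r\n'
    b'Content-Disposition: form-data; name="foo"\r\n'
    b'\r\n'
    b'bar\r\n'
    b'--XYZ\r\n'
    b'Content-Disposition: form-data; name="upload"; filename="a.txt"\r\n'
    b'Content-Type: text/plain\r\n'
    b'\r\n'
    b'hello\r\n'
    b'--XYZ--\r\n'
)
parsed = parse_multipart("multipart/form-data; boundary=XYZ", sample_body)
```

Calling it as parse_multipart(content_type, body) on the encoder's return values should work the same way, since encode returns the content type string and the body bytes.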

Upvotes: 15

ars

Reputation: 123558

You can't do this quickly with the stdlib. However, see the MultiPartForm class in this PyMOTW. You can probably use or modify that to accomplish whatever you need.

Upvotes: 6
