Goir

Reputation: 71

How to change response and content-length in uwsgi middleware?

I'm trying to write a middleware that replaces some data in the response body, which changes the content length. In our development environment we want to simulate the SSI (server-side include) behaviour of the actual web server, such as Nginx or Apache, for some static files that are not served through the application. We're using Werkzeug's built-in development server.

Here is what I have so far:

class ModifyBodyMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environment, start_response):
        def my_start_response(status, headers, exc_info=None):
            # change content-length somehow
            start_response(status, headers, exc_info)

        body = self.app(environment, my_start_response)
        body = do_modifications(body)

        return body

For simplicity, assume do_modifications replaces the whole content with foobar. I need the actual body in order to modify it, but I also need to set the new Content-Length header somehow.

Thanks, Goir

Upvotes: 0

Views: 1415

Answers (2)

Goir

Reputation: 71

OK, I found a solution: instead of adding another middleware, I subclass SharedDataMiddleware, override a few of its methods, and modify the file when it's read.

EDIT: Added recursive calls so that includes inside included files are expanded too.

EDIT 2: Added support for the SSI #echo directive.

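import cStringIO  # Python 2 only; io.BytesIO is the Python 3 equivalent
import os
import re
from datetime import datetime

# Older Werkzeug; since 0.15 the class lives in werkzeug.middleware.shared_data
from werkzeug.wsgi import SharedDataMiddleware


class SharedDataSSIMiddleware(SharedDataMiddleware):
    """ Replace SSI includes with the real files on request
    """
    # Matches <!--#include virtual="..." --> and <!--#include file="..." -->
    ssi_incl_expr = re.compile(r'<!-- *# *include *(virtual|file)=[\'\"]([^\'"]+)[\'\"] *-->')
    # Matches <!--#echo encoding="..." var="..." -->
    ssi_echo_expr = re.compile(r'<!-- *# *echo *encoding=[\'\"]([^\'"]+)[\'\"] *var=[\'\"]([^\'"]+)[\'\"] *-->')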

    def __init__(self, app, exports, disallow=None, cache=True, cache_timeout=60 * 60 * 12, fallback_mimetype='text/plain'):
        super(SharedDataSSIMiddleware, self).__init__(app, exports, disallow, cache, cache_timeout, fallback_mimetype)

        self.environment = None

    def get_included_content(self, path_info, path):
        full_path = os.path.join(path_info, path)
        with open(full_path) as fp:
            data = fp.read()
            return self._ssi_include(full_path, data)

    def _get_ssi_echo_value(self, encoding, var_name):
        # encoding is accepted but ignored; unknown variables expand to ''
        return self.environment.get(var_name, '')

    def _ssi_include(self, filename, content):
        content = re.sub(
            self.ssi_incl_expr,
            lambda x: self.get_included_content(os.path.dirname(filename), x.groups()[1]),
            content
        )
        content = re.sub(
            self.ssi_echo_expr,
            lambda x: self._get_ssi_echo_value(*x.groups()),
            content
        )
        return content

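    def _opener(self, filename):
        # Expand the SSI directives once, then serve the result from memory.
        with open(filename, 'rb') as fp:
            content = self._ssi_include(filename, fp.read())
        buf = cStringIO.StringIO()
        buf.write(content)
        size = buf.tell()  # length of the expanded content, not of the file on disk
        buf.seek(0)  # rewind so the response starts at the first byte

        return lambda: (buf, datetime.utcnow(), size)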

    def __call__(self, environ, start_response):
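        # Keeping environ on the instance is not thread-safe; fine for a
        # single-threaded development server only.
        self.environment = environ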
        response = super(SharedDataSSIMiddleware, self).__call__(environ, start_response)
        self.environment = None
        return response

This reads the actual file, expands the SSI directives in it, and returns a StringIO object containing the modified data instead of the file itself. Don't pass the static_files parameter to Werkzeug's run_simple; that would just add the default SharedDataMiddleware, which we don't want here.

Just wrap your app with the middleware above:

app = SharedDataSSIMiddleware(app, exports={'/foo': 'path'})
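
The include expansion on its own can also be tried without Werkzeug. The following is only a rough Python 3 sketch of the same idea, not the class above: expand_ssi, SSI_INCLUDE and max_depth are names made up for this example, and like the middleware it resolves both virtual= and file= relative to the including file, which is simpler than what Apache or Nginx really do.

```python
import os
import re
import tempfile

# Matches <!--#include virtual="..." --> and <!--#include file="..." -->
SSI_INCLUDE = re.compile(
    rb'<!--\s*#\s*include\s+(?:virtual|file)=[\'"]([^\'"]+)[\'"]\s*-->'
)


def expand_ssi(filename, max_depth=10):
    """Return the bytes of filename with include directives expanded recursively."""
    if max_depth < 0:
        # Guard against files that include each other in a loop.
        raise RuntimeError("SSI include nesting too deep: %s" % filename)
    base_dir = os.path.dirname(filename)
    with open(filename, "rb") as fp:
        content = fp.read()

    def replace(match):
        # Assumption: every path is relative to the including file.
        included = os.path.join(base_dir, match.group(1).decode("utf-8"))
        return expand_ssi(included, max_depth - 1)

    return SSI_INCLUDE.sub(replace, content)


# Small demonstration with two temporary files.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "footer.html"), "wb") as fp:
        fp.write(b"<footer>bye</footer>")
    with open(os.path.join(root, "index.html"), "wb") as fp:
        fp.write(b'<p>hi</p><!--#include virtual="footer.html" -->')
    expanded = expand_ssi(os.path.join(root, "index.html"))

print(expanded)
```

The length of the returned bytes is what would have to go into Content-Length, which is why the middleware measures the size after expansion rather than asking the file system.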

Upvotes: 0

Graham Dumpleton

Reputation: 58563

Where in the content do you want to make the modifications? Should modifications only be done for certain response content types?

This sort of thing can get complicated. In the simplest case you would delay calling the server's start_response() in your middleware until you have buffered the complete response in memory, so that you can modify it and calculate the new Content-Length response header. This will cause problems, though, if you are returning very large responses or streaming responses.
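
A minimal sketch of that simplest case, assuming Python 3 and a hypothetical modify callable that maps bytes to bytes (the class name and demo_app are invented for illustration). It holds back start_response() until the whole body has been collected:

```python
class BufferingBodyMiddleware:
    """Buffer the whole response, rewrite it, then send a fresh Content-Length."""

    def __init__(self, app, modify):
        self.app = app
        self.modify = modify  # hypothetical callable: bytes -> bytes

    def __call__(self, environ, start_response):
        captured = {}
        chunks = []

        def capturing_start_response(status, headers, exc_info=None):
            # Remember status and headers instead of sending them right away.
            captured["status"] = status
            captured["headers"] = list(headers)
            captured["exc_info"] = exc_info
            # Legacy write() callable: collect whatever is written through it.
            return chunks.append

        result = self.app(environ, capturing_start_response)
        try:
            for chunk in result:
                chunks.append(chunk)
        finally:
            if hasattr(result, "close"):
                result.close()

        body = self.modify(b"".join(chunks))

        # Drop the stale Content-Length and add one that matches the new body.
        headers = [
            (name, value)
            for name, value in captured["headers"]
            if name.lower() != "content-length"
        ]
        headers.append(("Content-Length", str(len(body))))
        start_response(captured["status"], headers, captured["exc_info"])
        return [body]


def demo_app(environ, start_response):
    # Stand-in application used only for this example.
    start_response("200 OK", [("Content-Type", "text/plain"), ("Content-Length", "5")])
    return [b"hello"]


app = BufferingBodyMiddleware(demo_app, lambda body: b"foobar")
```

Everything is held in memory before the first byte is sent, so this is only reasonable for small, non-streaming responses.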

If you are dealing only with HTML and need to change only the <head>, you can use a mechanism that buffers only until it sees <body> or, as a failsafe, until a certain number of bytes have been buffered. If you expect to insert anything just before </body>, you can't avoid buffering everything, which is typically bad.

The big question is what you are actually trying to achieve with this. If that were known, it might be possible to provide a better answer or point you in a different direction.
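
A rough sketch of the partial-buffering idea, again assuming Python 3. HeadRewriteMiddleware, modify_head and max_buffer are hypothetical names, the legacy write() callable is deliberately not handled, and the response is assumed to be HTML with a numeric Content-Length if one is present:

```python
class HeadRewriteMiddleware:
    """Buffer only until <body> (or max_buffer bytes), rewrite that prefix, stream the rest."""

    def __init__(self, app, modify_head, max_buffer=64 * 1024):
        self.app = app
        self.modify_head = modify_head  # hypothetical callable: bytes -> bytes
        self.max_buffer = max_buffer    # failsafe limit on buffered bytes

    def __call__(self, environ, start_response):
        captured = {}

        def capturing_start_response(status, headers, exc_info=None):
            captured["args"] = (status, list(headers), exc_info)
            return self._unsupported_write

        result = self.app(environ, capturing_start_response)
        return self._stream(result, captured, start_response)

    @staticmethod
    def _unsupported_write(data):
        raise NotImplementedError("legacy write() is not handled in this sketch")

    def _stream(self, result, captured, start_response):
        chunks = iter(result)
        buffered = b""
        try:
            # Phase 1: buffer until "<body" appears or the failsafe limit is hit.
            for chunk in chunks:
                buffered += chunk
                if b"<body" in buffered.lower() or len(buffered) >= self.max_buffer:
                    break
            cut = buffered.lower().find(b"<body")
            if cut >= 0:
                rewritten = self.modify_head(buffered[:cut]) + buffered[cut:]
            else:
                rewritten = buffered  # no <body> seen: pass through unchanged

            # Shift Content-Length by however much the prefix grew or shrank.
            status, headers, exc_info = captured["args"]
            delta = len(rewritten) - len(buffered)
            headers = [
                (name, str(int(value) + delta)) if name.lower() == "content-length"
                else (name, value)
                for name, value in headers
            ]
            start_response(status, headers, exc_info)
            yield rewritten
            # Phase 2: stream the remaining chunks untouched.
            for chunk in chunks:
                yield chunk
        finally:
            close = getattr(result, "close", None)
            if close is not None:
                close()


def html_app(environ, start_response):
    # Stand-in application; note that </head> is split across two chunks.
    body = [b"<html><head><title>x</title></he", b"ad><body>hello</body></html>"]
    length = str(sum(len(part) for part in body))
    start_response("200 OK", [("Content-Type", "text/html"), ("Content-Length", length)])
    return body


def add_meta(head):
    return head.replace(b"</head>", b"<meta name='x'></head>")


app = HeadRewriteMiddleware(html_app, add_meta)
```

Because only the prefix changes, the new Content-Length can be derived from the original one without reading the rest of the response.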


UPDATE 1

FWIW, if you were using mod_wsgi-express, all you would need to do is pass the additional --include-file option with ssi.conf as its argument, and put the following in the ssi.conf configuration snippet:

LoadModule filter_module ${MOD_WSGI_MODULES_DIRECTORY}/mod_filter.so
LoadModule include_module ${MOD_WSGI_MODULES_DIRECTORY}/mod_include.so

<Location />
Options +Includes
AddOutputFilterByType INCLUDES text/html
</Location>

If the response content type is text/html, the response is then passed through the Apache INCLUDES filter and expanded appropriately.

Thus you could make use of:

If the intent is to eventually target SSI mechanisms of Apache in production, then this would give you a more reliable result as mod_wsgi-express is still using Apache to do the heavy lifting.

Upvotes: 1
