Reputation: 71
I'm trying to write a middleware that replaces some data in the response, which changes the content length. In our development environment we want to simulate the SSI include behaviour of the production web server (Nginx or Apache) for some static files that are not served through the application. We're using the development server included with Werkzeug.
Here is what I have so far:
    class ModifyBodyMiddleware(object):
        def __init__(self, app):
            self.app = app

        def __call__(self, environment, start_response):
            def my_start_response(status, headers, exc_info=None):
                # change content-length somehow
                start_response(status, headers, exc_info)

            body = self.app(environment, my_start_response)
            body = do_modifications(body)
            return body
For simplicity, assume do_modifications replaces the whole content with foobar. I need the actual body in order to modify it, but I also need to set the new Content-Length header somehow.
Thanks Goir
Upvotes: 0
Views: 1415
Reputation: 71
OK, I found a solution: instead of adding another middleware, I subclass SharedDataMiddleware and modify the file when it is read.
EDIT: Added recursive calls so that includes inside included files are expanded. EDIT2: Added support for the #echo SSI directive.
    # Python 2 and an older Werkzeug release are assumed here; newer releases
    # provide SharedDataMiddleware in werkzeug.middleware.shared_data.
    import cStringIO
    import os
    import re
    from datetime import datetime

    from werkzeug.wsgi import SharedDataMiddleware


    class SharedDataSSIMiddleware(SharedDataMiddleware):
        """ Replace SSI includes with the real files on request
        """
        ssi_incl_expr = re.compile(r'<!-- *# *include *(virtual|file)=[\'\"]([^\'"]+)[\'\"] *-->')
        ssi_echo_expr = re.compile(r'<!-- *# *echo *encoding=[\'\"]([^\'"]+)[\'\"] *var=[\'\"]([^\'"]+)[\'\"] *-->')

        def __init__(self, app, exports, disallow=None, cache=True,
                     cache_timeout=60 * 60 * 12, fallback_mimetype='text/plain'):
            super(SharedDataSSIMiddleware, self).__init__(
                app, exports, disallow, cache, cache_timeout, fallback_mimetype)
            self.environment = None

        def get_included_content(self, path_info, path):
            # Resolve the include relative to the including file, then expand
            # any includes inside the included file as well.
            full_path = os.path.join(path_info, path)
            with open(full_path) as fp:
                data = fp.read()
            return self._ssi_include(full_path, data)

        def _get_ssi_echo_value(self, encoding, var_name):
            # The encoding argument is accepted but ignored.
            return self.environment.get(var_name)

        def _ssi_include(self, filename, content):
            content = re.sub(
                self.ssi_incl_expr,
                lambda x: self.get_included_content(os.path.dirname(filename), x.groups()[1]),
                content
            )
            content = re.sub(
                self.ssi_echo_expr,
                lambda x: self._get_ssi_echo_value(*x.groups()),
                content
            )
            return content

        def _opener(self, filename):
            # Serve an in-memory copy of the file with the SSI directives expanded.
            file = cStringIO.StringIO()
            with open(filename, 'rb') as fp:
                content = fp.read()
            content = self._ssi_include(filename, content)
            file.write(content)
            file.flush()
            size = file.tell()
            file.reset()
            return lambda: (file, datetime.utcnow(), size)

        def __call__(self, environ, start_response):
            # Keep the WSGI environ around so that #echo can look up variables.
            self.environment = environ
            response = super(SharedDataSSIMiddleware, self).__call__(environ, start_response)
            self.environment = None
            return response
This reads the actual file, modifies it, and returns a StringIO object with the modified data instead of the actual file.
Don't use the static_files parameter of Werkzeug's run_simple; it just adds the default SharedDataMiddleware, which we don't want here. Instead, wrap your app with the middleware above:
    app = SharedDataSSIMiddleware(app, exports={'/foo': 'path'})
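For anyone who only needs the recursive include expansion outside of Werkzeug, here is a minimal Python 3 sketch of that step. It reuses the include regex from the class above; the function name expand_includes and the max_depth guard against include loops are my own additions for illustration and are not part of the original middleware.

```python
import os
import re

# Same include pattern as in the middleware above.
SSI_INCLUDE = re.compile(
    r'<!-- *# *include *(virtual|file)=[\'"]([^\'"]+)[\'"] *-->'
)


def expand_includes(filename, depth=0, max_depth=10):
    """Return the text of filename with SSI includes expanded recursively."""
    # Hypothetical safety net: stop if files include each other in a loop.
    if depth > max_depth:
        raise RuntimeError("SSI include nesting too deep: %s" % filename)
    with open(filename, encoding="utf-8") as fp:
        content = fp.read()
    base_dir = os.path.dirname(filename)

    def replace(match):
        # Paths are resolved relative to the including file.
        included = os.path.join(base_dir, match.group(2))
        return expand_includes(included, depth + 1, max_depth)

    return SSI_INCLUDE.sub(replace, content)
```

Like the middleware, this treats virtual and file includes the same way and resolves both relative to the including file, which is simpler than what Nginx or Apache actually do.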
Upvotes: 0
Reputation: 58563
Where in the content do you want to make the modifications? Should modifications only be done for certain response content types?
This sort of thing can get complicated. In the simplest case you would delay calling the server's start_response() in your middleware until you have buffered the complete response in memory, so that you can modify it and calculate the new Content-Length response header. This will cause problems, though, if you are returning very large responses or streaming responses.
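A minimal sketch of that simplest case, assuming the whole response fits comfortably in memory. The class name BufferingRewriteMiddleware and the modify callable (bytes in, bytes out) are hypothetical names used only for illustration; it rewrites every response regardless of content type.

```python
class BufferingRewriteMiddleware:
    """Buffer the whole response, rewrite it, then fix Content-Length."""

    def __init__(self, app, modify):
        self.app = app
        self.modify = modify  # hypothetical callable: bytes -> bytes

    def __call__(self, environ, start_response):
        captured = {}
        written = []

        def capture_start_response(status, headers, exc_info=None):
            # Remember status and headers instead of sending them now.
            captured["status"] = status
            captured["headers"] = list(headers)
            captured["exc_info"] = exc_info
            # Legacy write() callable: collect anything written through it.
            return written.append

        app_iter = self.app(environ, capture_start_response)
        try:
            # Consume the complete body before any header is sent.
            chunks = list(app_iter)
        finally:
            if hasattr(app_iter, "close"):
                app_iter.close()

        body = self.modify(b"".join(written + chunks))

        # Replace any existing Content-Length with the new body length.
        headers = [
            (name, value)
            for name, value in captured["headers"]
            if name.lower() != "content-length"
        ]
        headers.append(("Content-Length", str(len(body))))

        start_response(captured["status"], headers, captured["exc_info"])
        return [body]
```

Usage would be something like app = BufferingRewriteMiddleware(app, do_modifications), where do_modifications takes and returns the complete body as bytes.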
If you are dealing only with HTML and need to make a change in the <head> only, then you can use a mechanism that buffers only until it sees <body> or, as a failsafe, until a certain number of bytes have been buffered. If you expect to insert anything just before </body>, then you can't avoid buffering everything, which is typically bad.
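A rough sketch of the partial-buffering idea, covering only the body iterator. The function name rewrite_head, the modify_head callable and the 64 KiB default limit are assumptions for illustration. Adjusting or dropping the Content-Length header is left out, so a real middleware would still have to deal with that before calling start_response().

```python
def rewrite_head(chunks, modify_head, marker=b"<body", max_buffer=64 * 1024):
    """Yield chunks, passing only the part before marker through modify_head."""
    buffered = b""
    chunks = iter(chunks)
    for chunk in chunks:
        buffered += chunk
        index = buffered.lower().find(marker)
        if index != -1:
            # Marker found: rewrite everything before it, keep the rest as is.
            yield modify_head(buffered[:index]) + buffered[index:]
            break
        if len(buffered) >= max_buffer:
            # Failsafe: give up and pass the buffered data through unchanged.
            yield buffered
            break
    else:
        # Input ended before the marker or the limit was reached.
        if buffered:
            yield buffered
        return
    # Stream the remainder without buffering.
    yield from chunks
```

Only the data up to the marker (or up to the limit) is ever held in memory; everything after that is passed through chunk by chunk.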
The big question is what you are actually trying to achieve with this. If that were known, it might be possible to provide a better answer or point you in a different direction.
UPDATE 1
FWIW, if you were using mod_wsgi-express, all you would need to do is add the --include-file option with ssi.conf as its argument, and put the following in the ssi.conf configuration file snippet:
    LoadModule filter_module ${MOD_WSGI_MODULES_DIRECTORY}/mod_filter.so
    LoadModule include_module ${MOD_WSGI_MODULES_DIRECTORY}/mod_include.so

    <Location />
        Options +Includes
        AddOutputFilterByType INCLUDES text/html
    </Location>
If the response content type is text/html, the response is then passed through the Apache INCLUDES filter and expanded appropriately.
Thus you could make use of:
If the intent is to eventually target SSI mechanisms of Apache in production, then this would give you a more reliable result as mod_wsgi-express is still using Apache to do the heavy lifting.
Upvotes: 1