hlongmore

Reputation: 1846

Why does server response body include extra characters inserted before every character? (Apache2 / mod_wsgi / python 2.7)

I have a python wsgi application running under apache2 via mod_wsgi 3.3, on Ubuntu 12.04.4 LTS.

class MyGateway(object):
    # ...
    def __call__(self, environ, start_response):
        environ['gateway.errors'] = []
        return self.gateway(environ, start_response)
    # ...

if __name__ == '__main__':
    # Code to run outside of apache/mod_wsgi
    pass
else:
    g = MyGateway()
    application = g

It has been running fine for more than half a year, with no interaction required and no changes to the code or the configuration. This morning, tech support started getting calls that it was not working. When I looked into it, I found that the expected results were getting extra characters inserted. For example, if incorrect credentials were used, the correct body of the server response would be:

<b>Incorrect username or password.</b>

Instead, I am getting:

1
<
1
b
1
>
1
I
1
n
1
c
1
o
1
r
1
r
1
e
1
c
1
t
1

1
u
1
s
1
e
1
r
1
n
1
a
1
m
1
e
1

1
o
1
r
1

1
p
1
a
1
s
1
s
1
w
1
o
1
r
1
d
1
.
1
<
1
/
1
b
1
>
0

(I would have loved to collapse that about 12 lines down, but the powers that be declined that feature request.)

We tried:

None of these resolved the issue. I added logging to the python code to see whether the correct values were being returned by the python script (which would also help determine whether an external dependency was broken), and found that right up until the point where the __call__ method returns, I have the correct value:

def __call__(self, environ, start_response):
    environ['gateway.errors'] = []
    response = self.gateway(environ, start_response)
    log.write('response to be returned: <%s>' % (response), log.DEBUG)
    # The above results in
    # "response to be returned: <<b>Incorrect username or password.</b>>" in the
    # log file for invalid credentials.
    return response

I found this SO question with similar symptoms, and the resolution was to upgrade mod_wsgi. I looked into doing so, but apt-get tells me I am on the latest version, so short of compiling it myself, that will require upgrading to Ubuntu 14.x.

So, my question is threefold:

  1. Does anyone have better ideas as to what the likely cause of this corrupted server response is?
  2. If the restored machine exhibits the same behavior, but there were no issues prior to today, that would suggest some other machine has the problem (using Occam's razor to filter out other possibilities). However, no other network traffic is having problems with garbage data. What should I try to rule out next?
  3. The accepted answer and its comments for this serverfault question imply that I will need to install yet another module to enable logging of the response in apache, and even that might not get me the response body (the not-accepted answer that came 9 months later indicates it could work, but I see no response from anyone saying it does). Does anyone have more information about how to get the response body logged? Another possibility would be to install Wireshark on Ubuntu (I've only used it on Windows thus far) to see if the response is leaving the machine correctly but getting corrupted elsewhere, but is there a more common tool people use on Ubuntu, or a simpler way to check the response internal to the machine?

Upvotes: 0

Views: 399

Answers (1)

Graham Dumpleton

Reputation: 58523

Two things.

The first is that it looks like you are returning a string from your WSGI application instead of an iterable (e.g. list) of strings. This is resulting in each single character being sent one at a time, which is absolutely dreadful for performance. So don't return a string, but a list containing a single string.
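
To illustrate the difference, here is a minimal sketch. The helper name as_wsgi_iterable is hypothetical and not part of mod_wsgi or the asker's code; it only shows that iterating a bare string yields one block per character, while a one-element list yields the whole body as a single block.

```python
def as_wsgi_iterable(response):
    """Hypothetical helper: wrap a bare byte string in a list.

    A WSGI server iterates over whatever the application returns and
    sends each item as a separate block, so a bare string is sent one
    character at a time.
    """
    if isinstance(response, bytes):  # bytes is str on Python 2.7
        return [response]
    return response


body = b"<b>Incorrect username or password.</b>"

# Number of blocks the server would see in each case.
blocks_from_string = len(list(body))                  # one per character
blocks_from_list = len(list(as_wsgi_iterable(body)))  # a single block
```

In the question's code, the equivalent change would be to wrap the value returned by self.gateway(...) before returning it from __call__, assuming it really is a plain string.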

The second is that in combination with returning a string with each character being sent one at a time, you have no content length in the response headers. As a result, Apache is using chunked transfer encoding for the response content. The 1's in the output are actually the chunk sizes of that chunked encoding (each chunk holds one character, and the final 0 marks the end of the body), which suggests that whatever client you are using is not dealing with chunked encoding properly. So ensure you also set a response content length.
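
As a rough sketch of both points: chunk_frame below is a simplified model of HTTP/1.1 chunked framing (it is not how Apache is implemented), and example_app is a hypothetical WSGI application. The "200 OK" status and the text/html content type are assumptions made for illustration, not taken from the question.

```python
def chunk_frame(blocks):
    """Simplified model: frame each non-empty block as an HTTP/1.1 chunk."""
    out = []
    for block in blocks:
        if block:
            size = ("%x" % len(block)).encode("ascii")  # chunk size in hex
            out.append(size + b"\r\n" + block + b"\r\n")
    out.append(b"0\r\n\r\n")  # zero-length chunk terminates the body
    return b"".join(out)


body = b"<b>Incorrect username or password.</b>"

# What a server effectively sees when the application returns a bare string.
per_character = [body[i:i + 1] for i in range(len(body))]
framed = chunk_frame(per_character)
# A client that does not decode chunked encoding shows this framing verbatim:
# a "1" before every character and a trailing "0".


def example_app(environ, start_response):
    """Hypothetical app: one block plus an explicit Content-Length."""
    headers = [
        ("Content-Type", "text/html"),         # assumed content type
        ("Content-Length", str(len(body))),    # lets the server skip chunking
    ]
    start_response("200 OK", headers)          # assumed status
    return [body]
```

With a known Content-Length and a single block, the server has no reason to fall back to chunked encoding, so the client receives the body without any size markers.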

Upvotes: 1
