
Use HTTP/1.1 with SimpleHTTPRequestHandler

When I use HTTP/1.1 with SimpleHTTPRequestHandler, loading a page that pulls in other resources will hang after the second resource.

Here is a small reproducer:

from SimpleHTTPServer import SimpleHTTPRequestHandler
from BaseHTTPServer import HTTPServer

class MyRequestHandler(SimpleHTTPRequestHandler):
    #protocol_version = "HTTP/1.0"   # works
    protocol_version = "HTTP/1.1"   # hangs

server = HTTPServer(("localhost", 7080), MyRequestHandler)
server.serve_forever()

With the above server, the following HTML will hang when the browser tries to load b.png:

<html>
    <body>
        <img src="a.png">
        <img src="b.png">
    </body>
</html>

Can HTTP/1.1 be used with the SimpleHTTPServer module, and if so, how? Note that adding ForkingMixIn or ThreadingMixIn to the server allows things to progress; however, it seems that it should be possible without either of those mixins.

Upvotes: 5

Answers (1)

Piotr Dobrogost

Reputation: 42445

The behavior you see has three causes:

1. BaseHTTPServer.HTTPServer can, by default, handle only one request at a time.
2. Most user agents (browsers) open more than one connection to any given host at a time.
3. Most user agents use the keep-alive feature of HTTP 1.1 and do not close a connection immediately after receiving the requested entities.

What happens is that the browser gets all the entities it requested over the first connection it opens to the server: the page itself and possibly some of its resources. At the same time, the browser opens additional connections to fetch the remaining resources, but these connections cannot proceed because the server is tied up with the first one. The server stays tied up because, although the browser has already received everything it requested over that connection, it does not close the connection immediately in case it can be reused to fetch more entities in the near future. The server does not close the connection on its side either, because the browser specified version 1.1 of HTTP and sent a Connection: keep-alive header.

Only after the first connection times out does the server start handling the next waiting connection, so the additional resources requested over that connection are then downloaded. If you wait long enough, the browser manages to get all the resources.

You can observe the difference by setting the network.http.max-persistent-connections-per-server preference to 1 (instead of the default 6) in Firefox, or the analogous setting in other browsers. Then, because all resources are requested over the same connection, retrieval of each starts as soon as the previous one has been retrieved, without any delay.
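The wait on an idle keep-alive connection can also be shortened from the server side, without either mixin. The following is a minimal sketch, not a definitive fix, and it rests on some assumptions. It is written for Python 3, where the modules are http.server rather than SimpleHTTPServer and BaseHTTPServer. It relies on the timeout attribute that request handlers inherit from StreamRequestHandler. The names TimeoutRequestHandler, fetch and demo, the 0.5 second value, and the placeholder files are all made up for illustration. The script opens two keep-alive connections to a single-threaded server and measures how long the second one waits.

```python
import functools
import http.client
import os
import tempfile
import threading
import time
from http.server import HTTPServer, SimpleHTTPRequestHandler


class TimeoutRequestHandler(SimpleHTTPRequestHandler):
    protocol_version = "HTTP/1.1"
    # Socket timeout in seconds (attribute inherited from StreamRequestHandler).
    # When a read on an idle keep-alive connection times out, the handler
    # closes that connection and the server can accept the next one.
    # The value 0.5 is an arbitrary choice for this demo.
    timeout = 0.5

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(port, path):
    """Request `path` on a new connection and leave that connection open."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", path)
    response = conn.getresponse()
    response.read()  # drain the body; the connection itself stays open
    return conn, response.status


def demo():
    with tempfile.TemporaryDirectory() as root:
        # Placeholder files standing in for the images from the question.
        for name in ("a.png", "b.png"):
            with open(os.path.join(root, name), "wb") as f:
                f.write(b"placeholder bytes, not a real image")

        handler = functools.partial(TimeoutRequestHandler, directory=root)
        server = HTTPServer(("127.0.0.1", 0), handler)  # port 0: any free port
        port = server.server_address[1]
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            first, first_status = fetch(port, "/a.png")
            start = time.monotonic()
            # The server is still parked on `first`, so this request is only
            # answered once the read on `first` has timed out.
            second, second_status = fetch(port, "/b.png")
            waited = time.monotonic() - start
            first.close()
            second.close()
        finally:
            server.shutdown()
            server.server_close()
    return first_status, second_status, waited


if __name__ == "__main__":
    print(demo())
```

With the timeout line removed, the second request in this sketch would wait until the first connection is closed by the client, which is the hang described in the question. A short timeout only limits the stall; it does not let the server handle connections concurrently, so the mixins remain the more direct option when the browser opens several connections at once.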

I'd like to thank marienz from #python IRC channel on freenode.net for his help with this problem.

Upvotes: 7
