rubik

Reputation: 9104

How to speed up an HTTP request

I need to get JSON data and I'm using urllib2:

import urllib2

request = urllib2.Request(url)
request.add_header('Accept-Encoding', 'gzip')
opener = urllib2.build_opener()
connection = opener.open(request)
data = connection.read()

Although the data isn't very large, the request is too slow.
Is there a way to speed it up? I can use 3rd party libraries too.

Upvotes: 4

Views: 7846

Answers (4)

Senthil Kumaran

Reputation: 56813

Accept-Encoding: gzip means that the client is ready to accept gzip-encoded content if the server is able to send it. The rest of the request goes down the socket, through your operating system's TCP/IP stack, and then to the physical layer.
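
To illustrate the gzip point: when the server does honor Accept-Encoding: gzip, the bytes returned by read() are still compressed and the client has to decode them itself. A minimal sketch, assuming Python 3 (where urllib2 became urllib.request); decode_body is a hypothetical helper name, not a library function:

```python
import gzip


def decode_body(raw, content_encoding):
    """Return the response body, decompressing it if the server used gzip."""
    # The server is free to ignore Accept-Encoding, so check what it
    # actually reported in the Content-Encoding response header.
    if content_encoding and content_encoding.strip().lower() == "gzip":
        return gzip.decompress(raw)
    return raw
```

With a Python 3 response object this would be called roughly as decode_body(connection.read(), connection.headers.get("Content-Encoding")).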

If the server supports ETags, then you can send an If-None-Match header to check whether the content has changed and otherwise rely on the cache. An example is given here.
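
A conditional request along these lines might look like the sketch below. It assumes Python 3's urllib.request and that the server returns an ETag header; fetch_with_etag, the cache dict layout, and the opener parameter are hypothetical names chosen for illustration:

```python
import urllib.error
import urllib.request


def fetch_with_etag(url, cache, opener=urllib.request.urlopen):
    """Fetch url, reusing the cached body when the server answers 304.

    cache maps url -> (etag, body); opener is injectable for testing.
    """
    request = urllib.request.Request(url)
    cached = cache.get(url)
    if cached is not None:
        # Ask the server to send the body only if the ETag no longer matches.
        request.add_header("If-None-Match", cached[0])
    try:
        response = opener(request)
    except urllib.error.HTTPError as err:
        # urlopen reports 304 Not Modified as an HTTPError.
        if err.code == 304 and cached is not None:
            return cached[1]
        raise
    body = response.read()
    etag = response.headers.get("ETag")
    if etag:
        cache[url] = (etag, body)
    return body
```

This only saves transfer time when the content is unchanged; the round trip to the server still happens on every call.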

There is not much you can do on the client side alone to improve HTTP request speed.

Upvotes: 6

wisty

Reputation: 7061

If you are making lots of requests, look into threading. Having about 10 workers making requests can speed things up, because you don't grind to a halt if one of them takes too long to get a connection.
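
A rough sketch of that idea, assuming Python 3 and its standard concurrent.futures module; fetch and fetch_all are hypothetical helper names, and the 10-second timeout is an arbitrary choice:

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request


def fetch(url, timeout=10):
    """Download one URL and return its raw body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch_one=fetch, max_workers=10):
    """Fetch every URL with a pool of worker threads, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs fetch_one concurrently; results come back in input order.
        return list(pool.map(fetch_one, urls))
```

This helps total throughput across many requests, not the latency of a single one. An exception raised by any one fetch propagates out of fetch_all when its result is collected.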

Upvotes: 0

user2665694

Reputation:

It is unlikely that urllib is the problem. If you have network issues and performance problems, consider using tools like Wireshark to investigate at the network level. I have very strong doubts that this is related to Python in any way.

Upvotes: 1

Richard Nienaber

Reputation: 10554

You're dependent on a number of different things here that may not be within your control:

  1. Latency/Bandwidth of your connection
  2. Latency/Bandwidth of server connection
  3. Load of server application and its individual processes

Items 2 and 3 are probably where the problem lies, and you won't be able to do much about them. Is the content cacheable? This will depend on your own application's needs and on the HTTP headers (e.g. ETag, Cache-Control, Last-Modified) that are returned from the server. The server may only update every day, in which case you might be better off only requesting data every hour.
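
If the data can tolerate being up to an hour old, the "only request every hour" idea could be sketched as a small time-based cache. This assumes Python 3; TimedCache and its parameters are hypothetical names, and the one-hour default is just the figure mentioned above:

```python
import time


class TimedCache:
    """Re-run a fetch function only after ttl_seconds have elapsed."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            # Cache is empty or stale: go back to the server.
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

Every call to get() within the hour returns the stored value without touching the network, so the slow request is paid at most once per interval.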

Upvotes: 3
