requests vs. request futures - response times inaccurate?

Question

I have Python code that runs 10 GET requests and measures the response time:

from datetime import datetime
from requests_futures.sessions import FuturesSession
import requests

class CustomSession(FuturesSession):

    def __init__(self, *args, **kwargs):
        super(CustomSession, self).__init__(*args, **kwargs)
        self.timing = {}
        self.timing = {}

    def request(self, method, url, *args, **kwargs):
        background_callback = kwargs.pop('background_callback', None)
        test_id = kwargs.pop('test_id', None)

        # start counting
        self.timing[test_id] = {}
        self.timing[test_id]['cS'] = datetime.now()

        def time_it(sess, resp):
            # here if you want to time the server stuff only
            self.timing[test_id]['cE'] = datetime.now()
            if background_callback:
                background_callback(sess, resp)
            # here if you want to include any time in the callback

        return super(CustomSession, self).request(method, url, *args,
                                                  background_callback=time_it,
                                                  **kwargs)

# using requests-futures

print('requests-futures:')

session = CustomSession()

futures = []
for i in range(10):

    futures.append(session.get('http://google.com/', test_id=i))
for future in futures:
    try:
        r = future.result()
        #print((session.timing[i]['cE'] - session.timing[i]['cS']))
    except Exception as e:
        print(e)
for i in range(10):
    print((session.timing[i]['cE'] - session.timing[i]['cS']).total_seconds() * 1000)


# using requests

print('requests:')

for i in range(10):

    check_start_timestamp = datetime.utcnow()
    r = requests.get('http://google.com')
    check_end_timestamp = datetime.utcnow()
    cE = int((check_end_timestamp - check_start_timestamp).total_seconds() * 1000)
    print(cE)

requests-futures:

112.959
118.627
160.139
174.32
214.399
224.295
267.557
276.582
316.824
327.00800000000004

requests:

It appears that:

The response times with requests-futures appear additive (the times are getting bigger)
Using plain requests runs drastically faster.

Is this normal? Am I missing something that would cause the discrepancy?

matsjoyce · Accepted Answer

Question 1

The response times with requests-futures appear additive (the times are getting bigger)

The reason is that requests_futures uses a thread pool behind the scenes. You can see this as the timing occur in chunks (separators added for clarity, the number of thread can be changed by the max_workers argument):

Default pool size of 2:

161.226
172.41600000000003
---
250.141
253.18600000000004
---
329.32800000000003
342.71000000000004
---
408.21200000000005
420.614
---
487.356
499.311

Pool size of 4:

Graph (blue is 2, red is 4):

As you can see, the group occur with roughly the same spacing, which should be the response time for one request.

Theoretically, putting the pool size to 10 gives optimal results for your test, giving results like:

252.977
168.379
161.689
165.44
169.238
157.929
171.77
154.089
168.283
159.23999999999998

However, then the effects below have more of an effect.

Question 2

Using plain requests runs drastically faster.

I can't be sure, but looking at the timing for the first request batch, its only ~15 units (microseconds?) off. This could be due to:

Thread switching. Since normal requests requesting occurs in the same thread as the requester, the job starts straight away. For the thread pool, the request is only started when the OS switches to the correct thread. This will incur a time overhead.
Polling. The futures may use some kind of polling to check for the result, so there may be a delay there, too.

The advantage of the futures is that the total time for 10 requests is lower, not the individual time, so this slight difference is not really a concern.

requests vs. request futures - response times inaccurate?

Answers (1)

Question 1

Question 2

Related Questions