okoboko
okoboko

Reputation: 4482

requests vs. request futures - response times inaccurate?

I have Python code that runs 10 GET requests and measures the response time:

from datetime import datetime
from requests_futures.sessions import FuturesSession
import requests

class CustomSession(FuturesSession):

    def __init__(self, *args, **kwargs):
        super(CustomSession, self).__init__(*args, **kwargs)
        self.timing = {}
        self.timing = {}

    def request(self, method, url, *args, **kwargs):
        background_callback = kwargs.pop('background_callback', None)
        test_id = kwargs.pop('test_id', None)

        # start counting
        self.timing[test_id] = {}
        self.timing[test_id]['cS'] = datetime.now()

        def time_it(sess, resp):
            # here if you want to time the server stuff only
            self.timing[test_id]['cE'] = datetime.now()
            if background_callback:
                background_callback(sess, resp)
            # here if you want to include any time in the callback

        return super(CustomSession, self).request(method, url, *args,
                                                  background_callback=time_it,
                                                  **kwargs)

# using requests-futures

print('requests-futures:')

session = CustomSession()

futures = []
for i in range(10):

    futures.append(session.get('http://google.com/', test_id=i))
for future in futures:
    try:
        r = future.result()
        #print((session.timing[i]['cE'] - session.timing[i]['cS']))
    except Exception as e:
        print(e)
for i in range(10):
    print((session.timing[i]['cE'] - session.timing[i]['cS']).total_seconds() * 1000)


# using requests

print('requests:')

for i in range(10):

    check_start_timestamp = datetime.utcnow()
    r = requests.get('http://google.com')
    check_end_timestamp = datetime.utcnow()
    cE = int((check_end_timestamp - check_start_timestamp).total_seconds() * 1000)
    print(cE)

requests-futures:

112.959
118.627
160.139
174.32
214.399
224.295
267.557
276.582
316.824
327.00800000000004

requests:

99
104
92
110
100
126
140
112
102
107

It appears that:

  1. The response times with requests-futures appear additive (the times are getting bigger)
  2. Using plain requests runs drastically faster.

Is this normal? Am I missing something that would cause the discrepancy?

Upvotes: 3

Views: 1992

Answers (1)

matsjoyce
matsjoyce

Reputation: 5844

Question 1


The response times with requests-futures appear additive (the times are getting bigger)

The reason is that requests_futures uses a thread pool behind the scenes. You can see this as the timing occur in chunks (separators added for clarity, the number of thread can be changed by the max_workers argument):

  • Default pool size of 2:

    161.226
    172.41600000000003
    ---
    250.141
    253.18600000000004
    ---
    329.32800000000003
    342.71000000000004
    ---
    408.21200000000005
    420.614
    ---
    487.356
    499.311
    
  • Pool size of 4:

    149.781
    154.761
    151.971
    155.385
    ---
    225.458
    230.596
    239.784
    240.386
    ---
    313.801
    314.056
    
  • Graph (blue is 2, red is 4):

    enter image description here As you can see, the group occur with roughly the same spacing, which should be the response time for one request.

Theoretically, putting the pool size to 10 gives optimal results for your test, giving results like:

252.977
168.379
161.689
165.44
169.238
157.929
171.77
154.089
168.283
159.23999999999998

However, then the effects below have more of an effect.

Question 2


Using plain requests runs drastically faster.

I can't be sure, but looking at the timing for the first request batch, its only ~15 units (microseconds?) off. This could be due to:

  • Thread switching. Since normal requests requesting occurs in the same thread as the requester, the job starts straight away. For the thread pool, the request is only started when the OS switches to the correct thread. This will incur a time overhead.
  • Polling. The futures may use some kind of polling to check for the result, so there may be a delay there, too.

The advantage of the futures is that the total time for 10 requests is lower, not the individual time, so this slight difference is not really a concern.

Upvotes: 4

Related Questions