PMvn

Reputation: 53

Python: grequests and requests give different responses

My original task: using the Trello API, get data through HTTP GET requests, and run the requests and process the responses asynchronously if possible. The API is served over "https://" URLs, which I access with a key and token.

Tools I used: the requests and grequests libraries.

Original task result: only the requests library worked and returned the Trello API's response. The grequests library failed with status_code = 302.

I tried to understand why it happens and wrote two reproducible scripts.

Script A : requests library used:

import requests

urls = [
    "https://www.google.com",
    "https://www.facebook.com/",
    "http://www.facebook.com/",
    "http://www.google.com",
    "http://fakedomain/",
    "http://python-tablib.org"
]

# Run requests:
for url in urls:
    print requests.get(url).status_code

Console output A (having some exception because of http://fakedomain/) :

200
200
200
200
Traceback (most recent call last):
  File "req.py", line 14, in <module>
    print requests.get(url).status_code
  File "D:\python\lib\site-packages\requests\api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "D:\python\lib\site-packages\requests\api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "D:\python\lib\site-packages\requests\sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "D:\python\lib\site-packages\requests\sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "D:\python\lib\site-packages\requests\adapters.py", line 415, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', gaierror(11001, 'getaddrinfo failed'))

Script B : grequests library used with map to send asynchronous requests:

import grequests

# This function will execute set of instructions when responses come:
def proc_response(response, **kwargs):
    # do something ..
    print response

# Request exception handler:
def my_except_handler(request, exception):
    print "Request failed : " + request.url

urls = [
    "https://www.google.com",
    "https://www.facebook.com/",
    "http://www.facebook.com/",
    "http://www.google.com",
    "http://fakedomain/",
    "http://python-tablib.org"
]
# Here is the list of tasks we build and run in parallel later:
actions_list = []

# Tasks list building:
for url in urls:
    action_item = grequests.get(url, hooks = {'response' : proc_response})
    actions_list.append(action_item)

# Run grequests:
print grequests.map(actions_list, exception_handler=my_except_handler)

Console output B :

<Response [302]>
<Response [302]>
<Response [200]>
<Response [301]>
<Response [302]>
<Response [200]>
Request failed : https://www.google.com
Request failed : https://www.facebook.com/
Request failed : http://www.facebook.com/
Request failed : http://fakedomain/
[None, None, None, <Response [200]>, None, <Response [200]>]

All I can conclude from this, given my relatively limited experience, is the following: for some reason, grequests is rejected by remote websites that requests works with normally. Since 302 indicates a redirect, it seems that grequests cannot fetch data from the location it is redirected to, while requests can. Passing allow_redirects=True to the get call in Script B didn't solve the issue.

I wonder why the two libraries give different responses. It is possible that I am missing something, and that these two scripts return different results by design rather than because of differences between the two libraries.

Thanks for your help in advance.

Upvotes: 2

Views: 2488

Answers (1)

Jan Vlcinsky

Reputation: 44112

grequests works well for me

Here is my script b.py, which I run via $ py.test -sv b.py:

import pytest
import grequests


@pytest.fixture
def urls():
    return [
        "https://www.google.com",
        "https://www.facebook.com/",
        "http://www.facebook.com/",
        "http://www.google.com",
        "http://fakedomain/",
        "http://python-tablib.org"
    ]


# This function will execute set of instructions when responses come:
def proc_response(response, **kwargs):
    # do something ..
    print "========Processing response=============", response.request.url
    print response
    if response.status_code != 200:
        print response.request.url
        print response.content


# Request exception handler:
def my_except_handler(request, exception):
    print "Request failed : " + request.url
    print request.response


def test_it(urls):
    # Here is the list of tasks we build and run in parallel later:
    actions_list = []

    # Tasks list building:
    for url in urls:
        action_item = grequests.get(url, hooks={'response': proc_response})
        actions_list.append(action_item)

    # Run grequests:
    print grequests.map(actions_list, exception_handler=my_except_handler)

It is based on your code, only rewritten to ease my experimentation.

Results: Final results are 200 or None

Last printout of my test shows:

[<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, None, <Response [200]>]

This is what is expected.
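The list returned by grequests.map keeps the order of the requests passed in, with None in the position of any request that failed. A minimal sketch of pairing the results back with their URLs, written for Python 3 (the scripts here use Python 2 print statements); the helper name split_results is hypothetical and not part of grequests:

```python
def split_results(urls, results):
    """Pair each URL with its result from grequests.map.

    `results` is assumed to be the list returned by grequests.map for
    requests built from `urls` in the same order.
    """
    ok, failed = [], []
    for url, response in zip(urls, results):
        if response is None:
            # The request raised (for example, DNS lookup failed).
            failed.append(url)
        else:
            ok.append((url, response.status_code))
    return ok, failed
```

Calling split_results(urls, grequests.map(actions_list)) would then give the successful URLs with their status codes and, separately, the URLs that need a retry.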

Note that you may hit temporary problems when fetching the data, since many parties are involved in serving each request.

Conclusion: different response processing confused you

The difference is that with requests you ask only for the final result, while with grequests you install the proc_response hook, which is called for every response, including the redirect ones.
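If the hook should only act on final responses, one option is to check the response's is_redirect property and return early for redirect hops. This is a sketch under assumptions, written for Python 3: make_final_only_hook, fetch_all, and the seen list are hypothetical names, and the example uses plain requests to stay simple. The same hooks dictionary can be passed to grequests.get as in b.py.

```python
def make_final_only_hook(seen):
    """Build a response hook that records only non-redirect responses.

    `seen` is a plain list supplied by the caller (a hypothetical collector).
    """
    def proc_response(response, **kwargs):
        # is_redirect is True for a 3xx response that carries a Location
        # header, i.e. an intermediate hop that will be followed.
        if response.is_redirect:
            return
        seen.append((response.url, response.status_code))
    return proc_response


def fetch_all(urls):
    """Fetch each URL in turn; needs the requests library and network access."""
    import requests  # imported here so the hook itself has no dependencies

    seen = []
    hook = make_final_only_hook(seen)
    for url in urls:
        requests.get(url, hooks={'response': hook})
    return seen
```

The hook returns None in both branches, which leaves the response object unchanged.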

requests follows the redirects too, but it does not report the intermediate responses.
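The intermediate responses are not lost, though: requests keeps them on the final response's history attribute. A minimal sketch, written for Python 3, of listing every hop; redirect_chain and show_chain are hypothetical helper names:

```python
def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response."""
    # response.history holds the intermediate redirect responses, oldest first.
    hops = list(response.history) + [response]
    return [(r.status_code, r.url) for r in hops]


def show_chain(url):
    """Fetch a URL and print each hop; needs requests and network access."""
    import requests  # imported here so redirect_chain has no dependencies

    response = requests.get(url)
    for status, hop_url in redirect_chain(response):
        print(status, hop_url)
```

For a URL that redirects once, show_chain would print the redirect hop first and the final response last, which matches the responses the hook sees in the grequests script.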

Upvotes: 4
