fpghost

Reputation: 2944

How can I request a URL that is already URL-encoded in python-requests?

I'm trying to request the following URL:

https://www.sainsburys.co.uk/shop/gb/groceries/shiraz/barossa-valley-estate-grenache-shiraz-mourv%C3%A8dre-75cl

Decoding it with urllib and printing it reveals it to be:

In [36]: print urllib.unquote(url)
https://www.sainsburys.co.uk/shop/gb/groceries/shiraz/barossa-valley-estate-grenache-shiraz-mourvèdre-75cl

i.e. an accented "e".
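For reference, a minimal sketch of the same check in Python 3. This is an assumption: the session above is Python 2, where the function is `urllib.unquote`; in Python 3 it lives in `urllib.parse`. It only shows that `%C3%A8` is the UTF-8 percent-encoding of "è" and that re-encoding gives back the original URL. No request is made.

```python
from urllib.parse import quote, unquote

url = ("https://www.sainsburys.co.uk/shop/gb/groceries/shiraz/"
       "barossa-valley-estate-grenache-shiraz-mourv%C3%A8dre-75cl")

# unquote() interprets percent-escapes as UTF-8 by default
decoded = unquote(url)
print(decoded)

# Percent-encode again as UTF-8, keeping ":" and "/" literal
reencoded = quote(decoded, safe=":/")
print(reencoded == url)
```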

But no matter what I pass to requests.get(...) (after import requests), I get a 404.

What is the proper input to give to the get method?

Upvotes: 1

Views: 320

Answers (1)

asm

Reputation: 58

You should decode the URL with 'latin-1' after passing it to urllib.unquote:

>>> import urllib
>>> import requests
>>> k = "https://www.sainsburys.co.uk/shop/gb/groceries/shiraz/barossa-valley-estate-grenache-shiraz-mourv%C3%A8dre-75cl"
>>> r = requests.get(urllib.unquote(k).decode("latin-1"))
>>> r.status_code
200
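To see which string this hands to requests, here is a rough sketch in Python 3 using only the standard library. It is an assumption-laden illustration, not part of the session above: `unquote_to_bytes` stands in for Python 2's `urllib.unquote`, which returned a byte string, and the variable names are made up for the example. It compares the 'latin-1' decoding with a 'utf-8' decoding of the same bytes and makes no network request.

```python
from urllib.parse import quote, unquote_to_bytes

k = ("https://www.sainsburys.co.uk/shop/gb/groceries/shiraz/"
     "barossa-valley-estate-grenache-shiraz-mourv%C3%A8dre-75cl")

# Raw bytes after removing the percent-escapes (0xC3 0xA8 for the accented letter)
raw = unquote_to_bytes(k)

# latin-1 maps each byte to one character, so the two bytes become two characters
as_latin1 = raw.decode("latin-1")
# utf-8 reads the same two bytes as the single character "è"
as_utf8 = raw.decode("utf-8")

print(as_latin1.rsplit("/", 1)[-1])
print(as_utf8.rsplit("/", 1)[-1])

# What each string looks like if it is percent-encoded as UTF-8 again
print(quote(as_latin1, safe=":/"))
print(quote(as_utf8, safe=":/"))
```

The 'utf-8' variant round-trips to the original URL, while the 'latin-1' variant produces a different path, so it is worth checking that the 200 response is for the page you expect.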

Upvotes: 1
