Python URL request response decoding

Question

I am trying to get the link to an image from a urllib.request response.

I am trying to get the content of this page: https://drscdn.500px.org/photo/27428737/m%3D900/v2?webp=true&sig=3d3700c82ea515ecc0b66ca265d6909d67861fbe055c0e817b535f75b21c7ebf and decode it, but decode("utf-8") fails with the error: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte. I have already checked the page encoding with document.characterSet in the browser console, and it reports UTF-8.

import re
import sys
import urllib.request


def ex4():
    url = sys.argv[1]
    r = re.compile(b"")
    try:
        resource = urllib.request.urlopen(url)
        response = resource.read().decode("utf-8")
        print(response)
        obj = r.search(response)
        if obj:
            print(obj.group(1))
        else:
            print("not found")
    except Exception as e:
        print("error: ", e)


ex4()

Maurice Meyer · Accepted Answer

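The URL returns the image itself as binary data, not an HTML page that links to it, so there is no text to decode. The byte 0xff at position 0 is consistent with a JPEG file, which begins with the bytes FF D8. You can save or process the bytes directly.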
For example:

import urllib.request

url = 'https://drscdn.500px.org/photo/27428737/m%3D900/v2?webp=true&sig=3d3700c82ea515ecc0b66ca265d6909d67861fbe055c0e817b535f75b21c7ebf'
resource = urllib.request.urlopen(url)
response = resource.read()  # raw image bytes; do not decode them

# Write the bytes unchanged, in binary mode
with open('/tmp/foo.jpg', 'wb') as f:
    f.write(response)
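
If the same script has to handle both HTML pages and direct image URLs, one option is to look at what the server sent before calling decode(). The following is only a minimal sketch: the helper names sniff_image_type and decode_if_text are made up for illustration and are not part of urllib, and the signature table covers just a few common formats.

```python
from email.message import Message
from typing import Optional

# Leading "magic" bytes of a few common image formats (not exhaustive).
_SIGNATURES = (
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
)


def sniff_image_type(data: bytes) -> Optional[str]:
    """Hypothetical helper: name the image format the bytes start with, or None."""
    for signature, name in _SIGNATURES:
        if data.startswith(signature):
            return name
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def decode_if_text(body: bytes, content_type: str) -> Optional[str]:
    """Hypothetical helper: decode body only if Content-Type declares text.

    Returns None for binary payloads such as images.
    """
    # Reuse the standard library's header parser for the Content-Type value.
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_maintype() != "text":
        return None
    # Assumption: fall back to UTF-8 when the header names no charset.
    charset = msg.get_content_charset() or "utf-8"
    return body.decode(charset)
```

With urllib.request, the header value can be read from the response as resource.headers.get("Content-Type", "") and passed in together with resource.read(). When decode_if_text returns None, the body can be written to a file in binary mode as in the answer above, and sniff_image_type can suggest a file extension.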
