Reputation: 97
I am trying to get the link to an image from a urllib.request response.
I am trying to read the content of this page: https://drscdn.500px.org/photo/27428737/m%3D900/v2?webp=true&sig=3d3700c82ea515ecc0b66ca265d6909d67861fbe055c0e817b535f75b21c7ebf and decode it, but decode("utf-8") raises the error: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte. I have already checked the page encoding with document.characterSet in the browser console, and it reports UTF-8.
import re
import sys
import urllib.request

def ex4():
    url = sys.argv[1]
    r = re.compile(b"<img .*? src=\"([^\"])*\" (.*?)*>")
    try:
        resource = urllib.request.urlopen(url)
        response = resource.read().decode("utf-8")
        print(response)
        obj = r.search(response)
        if obj:
            print(obj.group(1))
        else:
            print("not found")
    except Exception as e:
        print("error: ", e)

ex4()
Upvotes: 0
Views: 445
Reputation: 1815
What are you trying to achieve? Do you want to get the image and save it to a file? If so, skip the decoding and write the raw bytes to a file:
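import sys
import urllib.request

def ex4():
    url = sys.argv[1]
    try:
        resource = urllib.request.urlopen(url)
        response = resource.read()
        # Open in binary mode and write the bytes exactly as received.
        with open('img.png', 'wb') as f:
            f.write(response)
    except Exception as e:
        print("error: ", e)

ex4()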
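The snippet above always saves to img.png, whatever the server actually sent. The error in the question reports 0xff as the first byte, which is consistent with a JPEG (JPEG files start with the bytes FF D8) rather than a PNG. A minimal sketch of picking the extension from the leading bytes, assuming you only care about a few common formats; `sniff_image_extension` is a hypothetical helper name, not part of urllib:

```python
from typing import Optional


def sniff_image_extension(data: bytes) -> Optional[str]:
    """Guess a file extension from the leading 'magic' bytes of an image.

    Hypothetical helper for illustration; covers only JPEG, PNG, GIF and WebP.
    Returns None when the data does not match any of them.
    """
    if data.startswith(b"\xff\xd8\xff"):           # JPEG start-of-image marker
        return ".jpg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):      # PNG signature
        return ".png"
    if data[:6] in (b"GIF87a", b"GIF89a"):         # GIF header
        return ".gif"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":  # WebP inside a RIFF container
        return ".webp"
    return None
```

You could then build the file name as 'img' + (sniff_image_extension(response) or '.bin') before opening the file.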
Upvotes: 0
Reputation: 18106
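That URL serves the binary image itself, not an HTML page, so there is no text to decode and no <img> tag to search for. The 0xff at position 0 is simply the first byte of the image data (JPEG files begin with the bytes FF D8). You can save or process the bytes directly.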
For example:
import urllib.request

url = 'https://drscdn.500px.org/photo/27428737/m%3D900/v2?webp=true&sig=3d3700c82ea515ecc0b66ca265d6909d67861fbe055c0e817b535f75b21c7ebf'
resource = urllib.request.urlopen(url)
response = resource.read()
# Write the bytes unchanged; no decode() is needed for binary data.
with open('/tmp/foo.jpg', 'wb') as f:
    f.write(response)
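If the same script has to handle both HTML pages and direct image links, one option is to look at the Content-Type header before calling decode(). The following is only a sketch under that assumption: `decode_if_text` is a made-up helper name, and it relies on the response's `headers` attribute being an `email.message.Message` subclass, which is what `urllib.request.urlopen` returns for HTTP(S) URLs.

```python
from email.message import Message
from typing import Optional


def decode_if_text(headers: Message, body: bytes) -> Optional[str]:
    """Decode the body only when the server declares a text/* content type.

    Hypothetical helper for illustration. Returns None for binary
    responses such as image/jpeg, so the caller can save the bytes instead.
    """
    if headers.get_content_maintype() != "text":
        return None
    # Use the charset declared by the server; fall back to UTF-8 if absent.
    charset = headers.get_content_charset() or "utf-8"
    return body.decode(charset, errors="replace")


# Usage sketch (performs a network request, so it is left as a comment):
#
#   import urllib.request
#   with urllib.request.urlopen(url) as resource:
#       body = resource.read()
#       text = decode_if_text(resource.headers, body)
#   if text is None:
#       with open('/tmp/foo.jpg', 'wb') as f:
#           f.write(body)
```

With this check, the regex search for <img> tags would run only when `text` is not None, and image URLs like the one in the question would go straight to the file.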
Upvotes: 1