Reputation: 3
I am trying to download an image from the NGA.gov site using Python 3 and urllib.
The site does not serve images from a standard .jpg URL, and I get an error.
import urllib.request
from bs4 import BeautifulSoup

try:
    with urllib.request.urlopen("http://images.nga.gov/?service=asset&action=show_preview&asset=33643") as url:
        s = url.read()
    soup = BeautifulSoup(s, 'html.parser')
    img = soup.find("img")
    urllib.request.urlretrieve(img,"C:\art.jpg")
except Exception as e:
    print(e)
Error:
Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
expected string or bytes-like object
Can someone please explain why I am getting this error and how I can save the images to my computer?
Upvotes: 0
Views: 95
Reputation: 3656
There is no need to use BeautifulSoup! Just do:
import urllib.request

with urllib.request.urlopen("http://images.nga.gov/?service=asset&action=show_preview&asset=33643") as url:
    s = url.read()
# Write the bytes already read; a second url.read() would return nothing.
with open("art.jpg", 'wb') as fp:
    fp.write(s)
Upvotes: 0
Reputation: 1407
BeautifulSoup is a library for HTML/XML parsing.
This URL already returns the image itself, so what are you trying to parse?
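This works OK (note the raw string for the path, since "\a" in a normal string literal is the escape for the bell character):
urllib.request.urlretrieve("http://images.nga.gov/?service=asset&action=show_preview&asset=33643", r"C:\art.jpg")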
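If you are not sure whether a URL returns an image or an HTML page, one option is to look at the Content-Type header before saving. The following is only a rough sketch: the helper names looks_like_image and download_image are made up for illustration, and it assumes the server sends an accurate Content-Type header, which I have not verified for this site.

```python
import urllib.request


def looks_like_image(content_type: str) -> bool:
    """Return True when a Content-Type header value names an image type."""
    # Drop parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type.startswith("image/")


def download_image(url: str, path: str) -> None:
    """Save the response body to path, but only if it claims to be an image.

    Hypothetical helper: assumes the server's Content-Type header is accurate.
    """
    with urllib.request.urlopen(url) as response:
        content_type = response.headers.get("Content-Type", "")
        if not looks_like_image(content_type):
            raise ValueError(f"expected an image, got {content_type!r}")
        data = response.read()  # the body can only be read once
    with open(path, "wb") as fp:
        fp.write(data)
```

You would call it as download_image("http://images.nga.gov/?service=asset&action=show_preview&asset=33643", "art.jpg"). If the header says text/html instead, that is the case where parsing with BeautifulSoup to find an img tag would make sense.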
Upvotes: 1