Ryan

Reputation: 3

How to download an image from a site with no clear extension?

I am trying to download an image from the NGA.gov site using Python 3 and urllib.

The site does not serve images from a standard .jpg URL, and I get an error.

import urllib.request
from bs4 import BeautifulSoup


try:
    with urllib.request.urlopen("http://images.nga.gov/?service=asset&action=show_preview&asset=33643") as url:
        s = url.read()

    soup = BeautifulSoup(s, 'html.parser') 


    img = soup.find("img")
    urllib.request.urlretrieve(img,"C:\art.jpg")

except Exception as e:
    print (e)

Error: Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER. expected string or bytes-like object

Can someone please explain why I am getting this error and how to download the images to my computer?

Upvotes: 0

Views: 95

Answers (2)

Kafo

Reputation: 3656

There is no need to use BeautifulSoup! Just do:

import urllib.request

# The URL returns the image bytes directly: read them once, then write them out.
with urllib.request.urlopen("http://images.nga.gov/?service=asset&action=show_preview&asset=33643") as url:
    s = url.read()
with open("art.jpg", 'wb') as fp:
    fp.write(s)
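Since the URL gives no hint about the file type, a quick sanity check on the downloaded bytes can confirm they look like an image before they are saved. The following is only a minimal sketch: `sniff_image_extension` and `_SIGNATURES` are hypothetical names, and only the well-known leading bytes of JPEG, PNG and GIF files are covered.

```python
from typing import Optional

# Leading "magic" bytes for a few common image formats.
_SIGNATURES = (
    (b"\xff\xd8\xff", ".jpg"),
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
)


def sniff_image_extension(data: bytes) -> Optional[str]:
    """Return a file extension if data starts like a known image, else None."""
    for signature, extension in _SIGNATURES:
        if data.startswith(signature):
            return extension
    return None
```

If `sniff_image_extension(s)` returns None for the bytes read from the URL, the server probably sent something other than one of these image formats (an HTML error page, for example), and writing it to art.jpg would not produce a usable picture.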

Upvotes: 0

Victor Gavro

Reputation: 1407

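BeautifulSoup is a library for HTML/XML parsing. This URL already returns the image itself, so there is nothing to parse. Downloading it directly works fine (note the raw string for the path: in a normal string literal, "\a" is an escape sequence for the bell character):

urllib.request.urlretrieve("http://images.nga.gov/?service=asset&action=show_preview&asset=33643", r"C:\art.jpg")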
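When the URL carries no extension, the Content-Type response header is one place to look for the file type. Below is a rough sketch of that idea, not a tested solution for this particular site: `extension_for`, `download_image` and the `_EXTENSIONS` map are hypothetical names, the map is deliberately small, and it assumes the server sends an accurate Content-Type.

```python
import urllib.request

# Small explicit map; extend it for other formats as needed.
_EXTENSIONS = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
}


def extension_for(content_type: str) -> str:
    """Map a Content-Type value to an extension, ignoring parameters and case."""
    mime = content_type.split(";")[0].strip().lower()
    return _EXTENSIONS.get(mime, ".bin")


def download_image(url: str, stem: str) -> str:
    """Save the resource at url as stem + extension and return the file name."""
    with urllib.request.urlopen(url) as response:
        content_type = response.headers.get("Content-Type", "")
        data = response.read()
    filename = stem + extension_for(content_type)
    with open(filename, "wb") as fp:
        fp.write(data)
    return filename
```

Calling `download_image(url, "art")` would then write art.jpg, art.png or art.gif depending on what the server reports, and fall back to art.bin for anything the map does not recognise.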

Upvotes: 1
