Modifying the url parameter to download images from multiple web-sites

Question

I am trying to download the images for every case listed in the CaseIDs array, but the code below doesn't work. I want it to run for all of the cases.

from bs4 import BeautifulSoup
import requests as rq
from urllib.parse import urljoin
from tqdm import tqdm

CaseIDs = [100237, 99817, 100271]

with rq.session() as s:
    for caseid in tqdm(CaseIDs):
        url = 'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID= {caseid}'
        r = s.get(url)
        soup = BeautifulSoup(r.text, "html.parser")

        url = urljoin(url, soup.find('a', text='Text and Images Only')['href'])
        r = s.get(url)
        soup = BeautifulSoup(r.text, "html.parser")

        links = [urljoin(url, i['src']) for i in soup.select('img[src^="GetBinary.aspx"]')]

        count = 0
        for link in links:
            content = s.get(link).content
            with open("test_image" + str(count) + ".jpg", 'wb') as f:
                f.write(content)
            count += 1

Kingindanord · Accepted Answer

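The URL in your loop is a plain string literal, so {caseid} is sent to the server as literal text instead of being replaced by the case ID. There is also a stray space after CaseID=. Try using format() like this: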

url = 'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID={}'.format(caseid)
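An f-string does the same substitution and may read more clearly. The sketch below only builds the URLs and file names, so it runs without touching the network. The helper names build_case_url and image_filename, and the case_<id>_image<n>.jpg naming scheme, are assumptions for illustration and not part of the original code. Including the case ID in the file name is a suggestion: in the question's loop the counter restarts at 0 for every case, so each case would overwrite the previous case's test_image0.jpg, test_image1.jpg, and so on.

```python
BASE_URL = "https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx"


def build_case_url(caseid):
    """Return the case page URL with the case ID substituted in."""
    # The f prefix makes Python replace {caseid} with its value.
    # Note there is no space after "CaseID=".
    return f"{BASE_URL}?xsl=main.xsl&CaseID={caseid}"


def image_filename(caseid, index):
    """Hypothetical naming scheme: keep files from different cases apart."""
    return f"case_{caseid}_image{index}.jpg"


case_ids = [100237, 99817, 100271]
urls = [build_case_url(caseid) for caseid in case_ids]

for url in urls:
    print(url)
```

Inside the original loop, this would mean replacing the url assignment with url = build_case_url(caseid), and opening image_filename(caseid, count) instead of "test_image" + str(count) + ".jpg". Another option is to let requests build the query string by passing params={"xsl": "main.xsl", "CaseID": caseid} to s.get(), which avoids hand-written string formatting altogether.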
