Reputation: 25
I am trying to download the images for every case listed in the CaseIDs list, but it doesn't work. How can I make the code run for all cases?
from bs4 import BeautifulSoup
import requests as rq
from urllib.parse import urljoin
from tqdm import tqdm

CaseIDs = [100237, 99817, 100271]

with rq.session() as s:
    for caseid in tqdm(CaseIDs):
        url = 'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID= {caseid}'
        r = s.get(url)
        soup = BeautifulSoup(r.text, "html.parser")
        url = urljoin(url, soup.find('a', text='Text and Images Only')['href'])
        r = s.get(url)
        soup = BeautifulSoup(r.text, "html.parser")
        links = [urljoin(url, i['src']) for i in soup.select('img[src^="GetBinary.aspx"]')]
        count = 0
        for link in links:
            content = s.get(link).content
            with open("test_image" + str(count) + ".jpg", 'wb') as f:
                f.write(content)
            count += 1
Upvotes: 1
Views: 63
Reputation: 2036
The {caseid} placeholder is never filled in, because the URL is a plain string. Try using str.format(), like this:
url = 'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID={}'.format(caseid)
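
To see why this helps, here is a minimal sketch comparing the plain string from the question with the format() version. The names URL_TEMPLATE and build_case_url are made up for illustration and are not part of the original code; nothing is downloaded here.

```python
# Hypothetical helper names; only the URL itself comes from the question.
URL_TEMPLATE = (
    'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx'
    '?xsl=main.xsl&CaseID={}'
)


def build_case_url(caseid):
    """Fill the case ID into the URL template with str.format()."""
    return URL_TEMPLATE.format(caseid)


# A plain string (no f prefix, no .format call) keeps the braces literally,
# so every request in the question's loop asks for the same broken URL.
plain_url = 'CaseForm.aspx?xsl=main.xsl&CaseID= {caseid}'

CaseIDs = [100237, 99817, 100271]
case_urls = [build_case_url(caseid) for caseid in CaseIDs]

for case_url in case_urls:
    print(case_url)
```

Inside the loop from the question, the same idea would be url = build_case_url(caseid).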
Upvotes: 2
Reputation: 1368
You need to use an f-string to pass your caseid value in, as you're trying to do:
url = f'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID= {caseid}'
(You probably also need to remove the space between the = and the {.)
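
A minimal sketch of the f-string with the space removed follows. It also touches on one more thing you may run into: the loop in the question resets count to 0 for every case, so test_image0.jpg and so on would be overwritten by each later case. Putting the case ID in the file name is one possible way around that. The helper names case_url and image_filename, and the naming scheme itself, are assumptions made for illustration.

```python
def case_url(caseid):
    """Build the case URL with an f-string; note there is no space after '='."""
    return (
        'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx'
        f'?xsl=main.xsl&CaseID={caseid}'
    )


def image_filename(caseid, index):
    """Hypothetical naming scheme: one distinct file name per case and image."""
    return f'case_{caseid}_image_{index}.jpg'


CaseIDs = [100237, 99817, 100271]

for caseid in CaseIDs:
    # Only show what would be requested and saved; no network access here.
    print(case_url(caseid), '->', image_filename(caseid, 0))
```

In the original loop, open(image_filename(caseid, count), 'wb') would take the place of the "test_image" + str(count) + ".jpg" expression.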
Upvotes: 2