Reputation: 83
I'm trying to scrape that website for the captcha image link.
The image appears when I use the browser's Inspect Element, but it is missing from the HTML I get when scraping.
My goal is to get the img tag.
Below is the code I tried.
import requests
from bs4 import BeautifulSoup

with requests.Session() as s:
    url = "https://myurl.com/"
    r = s.get(url)
    soup = BeautifulSoup(r.content, "html.parser")
    # Print every <img> tag present in the static HTML
    for item in soup.find_all("img"):
        print(item)
Upvotes: 1
Views: 115
Reputation: 487
As others have said, Selenium will help here: it loads the page in a real browser, so the img gets loaded and you can scrape it.
from selenium import webdriver
from bs4 import BeautifulSoup
import time

browser = webdriver.Firefox()
url = 'https://myurl.com/'
browser.get(url)
time.sleep(10)  # wait 10 seconds for the captcha to load
html = browser.page_source
browser.quit()  # close the browser once the HTML has been captured
soup = BeautifulSoup(html, features='html.parser')
imgs = soup.find_all('img')
for img in imgs:
    print(img)
Returns:
<img alt="" id="yw1" src="/site/captcha/v/5dd3ccb47dd88/"/>
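Note that the src printed above is relative to the site root, so it has to be joined with the page URL before it can be downloaded. A minimal sketch, assuming the image is served from the same host as the page (https://myurl.com/ is just the placeholder URL used above):

```python
from urllib.parse import urljoin

# Page the <img> tag was scraped from (placeholder URL from the code above).
page_url = "https://myurl.com/"
# src attribute exactly as printed above; it is relative to the site root.
relative_src = "/site/captcha/v/5dd3ccb47dd88/"

# urljoin resolves the relative path against the page URL.
captcha_url = urljoin(page_url, relative_src)
print(captcha_url)
```

The resulting absolute URL can then be passed to requests or opened in the browser.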
Upvotes: 0
Reputation: 33384
If you open the browser's Network tab, you will find the link below, which returns the captcha image URL in JSON format. You don't need Selenium for that.
You need to parse the response as JSON and then read the value of the url key.
import requests

with requests.Session() as s:
    url = "https://example.com/site/captcha/refresh/1/?_=1574163338269"
    r = s.get(url, verify=False)  # verify=False skips TLS certificate verification
    img = r.json()
    print(img['url'])
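The `_=1574163338269` part of the link looks like a millisecond timestamp, probably added for cache busting, though that is a guess on my part. Here is a rough sketch of building a fresh link and turning the returned url value into an absolute URL. The helper names and the sample payload are made up for illustration, and I am assuming the url value is a path relative to the site root; only the url key itself comes from the code above.

```python
import json
import time
from urllib.parse import urljoin


def build_refresh_url(base_url):
    """Build the refresh link with the current time in milliseconds.

    Assumption: the "_" parameter is only a cache-busting timestamp.
    """
    millis = int(time.time() * 1000)
    return urljoin(base_url, f"/site/captcha/refresh/1/?_={millis}")


def captcha_url_from_payload(base_url, payload_text):
    """Parse the JSON body and resolve its "url" value against the site."""
    data = json.loads(payload_text)
    return urljoin(base_url, data["url"])


# Made-up payload for illustration; the real response may carry other keys.
sample_payload = '{"url": "/site/captcha/v/5dd3ccb47dd88/"}'
full_url = captcha_url_from_payload("https://example.com/", sample_payload)
print(full_url)
```

In real use you would pass `r.text` from the request above instead of `sample_payload`.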
[Screenshot: Network tab]
Upvotes: 1