Reputation: 704
I'm trying to get some information from a website, scamadviser.com.
In particular, I'm interested in the final score shown in the shield (for example, for stackoverflow.com the value in the shield is 100%).
I've tried to inspect it, and I see that the path is:
This is what I tried:
import pandas as pd
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait

def scam(df):
    chrome_options = webdriver.ChromeOptions()
    trust = []
    country = []
    isp_country = []
    urls = ['stackoverflow.com', 'GitHub.com']
    driver = webdriver.Chrome('mypath', chrome_options=chrome_options)
    for x in urls:
        wait = WebDriverWait(driver, 20)
        response = driver.get('https://www.scamadviser.com/check-website/' + x)
        try:
            wait = WebDriverWait(driver, 30)
            # Scroll to each element, then read its text
            t = driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight", driver.find_element_by_xpath("//div[contains(@class,'trust__overlay shield-color--green') and contains(text(),'icon')]")).get_attribute('innerText')
            trust.append(t)
            c = driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight", driver.find_element_by_xpath("//div[contains(@class,'block__col') and contains(text(),'Country')]")).get_attribute('innerText')
            country.append(c)
            ic = driver.find_element_by_xpath("//div[contains(@class,'block__col') and contains(text(),'ISP')]").get_attribute('innerText')
            isp_country.append(ic)
        except:
            trust.append("Error")
            country.append("Error")
            isp_country.append("Error")
    # Create dataframe
    dict = {'URL': urls, 'Trust': trust, 'Country': country, 'ISP': isp_country}
    df = pd.DataFrame(dict)
    driver.quit()
    return df
but the dataframe created contains only Errors (i.e., it executes only the except branch of the try/except).
I can't understand if the error is due to the try/except and/or to the way I look at the element (using xpath). Any help would be great. Thanks
Upvotes: 1
Views: 61
Reputation: 29382
Based on the OP's response, and for this particular question: to get the trust score from the website the OP mentioned, the XPath below matches exactly one node (1/1) in the HTML DOM.
XPath:
//div[text()='Trustscore']/../following-sibling::div/descendant::div[@class='icon']
You do not need to scroll to this web element, because the trust score is already in the Selenium viewport as soon as the window is launched.
Use the XPath with an explicit wait, like this:
trusted_score = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[text()='Trustscore']/../following-sibling::div/descendant::div[@class='icon']")))
print(trusted_score.text)
For this you'll also need the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
PS: Make sure the Selenium browser window is maximized.
driver.maximize_window()
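As for whether the try/except itself is to blame: the bare except hides the real exception, so it is worth surfacing it. One likely cause, independent of the XPath, is that execute_script returns whatever the JavaScript snippet returns, and a snippet without a return statement gives None in Python, so chaining .get_attribute() onto it fails. The sketch below reproduces that shape without a browser; FakeDriver, read_trust and safe_read are hypothetical names used only for illustration.

```python
class FakeDriver:
    """Hypothetical stand-in for a Selenium driver (illustration only)."""

    def execute_script(self, script, *args):
        # Selenium hands back the value returned by the JavaScript snippet;
        # a snippet with no `return` statement yields None on the Python side.
        return None


def read_trust(driver):
    # Same shape as the call in the question: .get_attribute() is chained
    # onto the result of execute_script(), which is None here.
    return driver.execute_script(
        "arguments[0].scrollTop = arguments[0].scrollHeight", object()
    ).get_attribute("innerText")


def safe_read(driver):
    try:
        return read_trust(driver)
    except Exception as exc:
        # Catch broadly, but keep the exception type and message visible
        # instead of replacing them with a bare "Error".
        return f"Error: {type(exc).__name__}: {exc}"


print(safe_read(FakeDriver()))
```

Replacing `except:` with `except Exception as exc:` and printing `exc` in the original loop should show whether the failure is this AttributeError, a NoSuchElementException from the XPath, or something else.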
Update 1:
data = {'URL': urls,
'Trust': trust,
'Country': country,
'ISP': isp_country}
df = pd.DataFrame.from_dict(data)
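A possible alternative to keeping four parallel lists in step is to build one record per URL and let pandas assemble the columns. This is only a sketch: collect_rows and fake_fetch are hypothetical helpers, fetch stands for whatever Selenium lookup you end up using, and the values returned by fake_fetch are placeholders, not real scamadviser.com output.

```python
def collect_rows(urls, fetch):
    """Build one dict per URL.

    `fetch` is a hypothetical callable that returns (trust, country, isp)
    for a domain, or raises if the lookup fails.
    """
    rows = []
    for url in urls:
        try:
            trust, country, isp = fetch(url)
        except Exception as exc:
            # Record why this URL failed rather than a bare "Error".
            trust = country = isp = f"Error: {type(exc).__name__}"
        rows.append({"URL": url, "Trust": trust, "Country": country, "ISP": isp})
    return rows


def fake_fetch(domain):
    # Placeholder data for illustration only.
    if domain == "bad.example":
        raise ValueError("no score found")
    return ("100%", "US", "US")


rows = collect_rows(["stackoverflow.com", "bad.example"], fake_fetch)
for row in rows:
    print(row)
```

With pandas installed, pd.DataFrame(rows) turns these records into the same URL / Trust / Country / ISP table, and every row is guaranteed to have all four fields even when a lookup fails.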
Upvotes: 1