Reputation: 33
Hello, I am sorry for the long post, but I wanted to make sure the problem is understandable. I am new to Selenium. On this website, "https://xangle.io/project/list", when I click on any of the listed elements it takes me to a new page.
I want to scrape the links of each of these elements.
But the problem is that when I inspect those elements looking for URLs, I don't find any URLs in the HTML. Here is a screenshot of the HTML:
I looked through the inspected markup for those elements but could not find any link (maybe I missed it).
Anyway, this is what I tried, but I don't think it is the correct approach:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(r'C:\Users\User\AppData\Local\Programs\Python\Python37\Lib\site-packages\chromedriver_py\chromedriver_win32.exe')
driver.get('https://xangle.io/project/list')
wait = WebDriverWait(driver, 15)
wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@class='project-table']//div[@class='table-row']//div[3]")))
list_ = driver.find_elements_by_xpath("//div[@class='project-table']//div[@class='table-row']//div[3]")
for i in list_:
    i.click()
    print(driver.current_url)
    driver.back()
It throws an error:
StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
(Session info: chrome=80.0.3987.163)
Frankly speaking, I don't just want to get rid of the error; I want to find a correct way of scraping the URLs that do not show up when the elements are inspected.
Upvotes: 0
Views: 1218
Reputation: 1285
If you inspect the Network tab, you can see that the data comes from the site's API: https://api.xangle.io/project/list?items_per_page=50&page=0
If you look at the link for each project, you will see that it is a common prefix followed by the project's symbol.
import requests

url = "https://api.xangle.io/project/list?items_per_page=50&page=0"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.162 Safari/537.36'}
r = requests.get(url, headers=headers)

# Each project page URL is the site prefix plus the project's symbol
# taken from the API's JSON response.
prefix = "https://xangle.io/project/"
data = r.json()
links = [prefix + d["symbol"] for d in data]
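To make the mapping concrete without hitting the network, here is a minimal sketch of the symbol-to-URL step on a hypothetical sample payload (the real response shape should be confirmed in the Network tab):

```python
# Build project page URLs from an API-style JSON payload.
# The sample list below is hypothetical, standing in for r.json().
prefix = "https://xangle.io/project/"

def build_links(data):
    """Map each project's symbol to its public project-page URL."""
    return [prefix + d["symbol"] for d in data if d.get("symbol")]

sample = [{"symbol": "btc"}, {"symbol": "eth"}, {"symbol": "xrp"}]
print(build_links(sample))
# → ['https://xangle.io/project/btc', 'https://xangle.io/project/eth', 'https://xangle.io/project/xrp']
```

The `items_per_page` and `page` query parameters in the URL suggest the endpoint is paginated, so for the full list you would repeat the request with increasing `page` values until the response is empty.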
Upvotes: 2
Reputation: 1597
When a page is reloaded, previously found elements become stale, because the document you are now working with is not the same document in which those elements were found.
What you could do is change your pattern slightly and not reuse the list of elements:
driver.get('https://xangle.io/project/list')
wait = WebDriverWait(driver, 15)
wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@class='project-table']//div[@class='table-row']//div[3]")))
list_ = driver.find_elements_by_xpath("//div[@class='project-table']//div[@class='table-row']//div[3]")
names = [x.text for x in list_ if x.text]
for name in names:
    elem = wait.until(EC.element_to_be_clickable((By.XPATH, f'//div[@class="project-table"]//div[@class="table-row"]//div[3]//span[text()="{name}"]/..')))
    elem.click()
    print(driver.current_url)
    driver.back()
Upvotes: 1