Reputation: 23
I am trying to scrape data using Selenium and find_elements_by_xpath, but it fails randomly, and I can't see why. Here is the code:
import time

from selenium import webdriver

# Open chrome and go to website
driver = webdriver.Chrome()
url = "https://fortnitetracker.com/events/epicgames_S11_DH_Anaheim_Stage3"
driver.get(url)

i = 1
while i < 100:
    # Build the XPath for the player cell in table row i
    xpath = "/html/body/div[4]/div[2]/div[3]/div[1]/div/div[2]/table/tbody/tr[" + str(i) + "]/td[2]/div"
    player = driver.find_elements_by_xpath(xpath)
    print(player)
    text = player[0].text
    print(text)
    i += 1
The error I get is the following:
Traceback (most recent call last):
  File "C:/Users/Kristian/PycharmProjects/Tutorial/getnames.py", line 21, in <module>
    text = player[0].text
IndexError: list index out of range
However, the error doesn't always occur on the same row. Sometimes it gets 13 rows of data, sometimes 14 or 15, but never more than 18. I have no idea why this happens, as the xpath is always correct. Any help is appreciated.
Upvotes: 0
Views: 555
Reputation: 1556
You are using find_elements_by_xpath, and this method always returns a list.
If elements are found, it returns a list of those elements; if not, it returns an empty list.
So if the element is not on the page at the moment of the search, find_elements_by_xpath will search, find nothing, and you will get player = [].
Then, when you do text = player[0].text, it tries to read index 0 of the empty list, which raises IndexError: list index out of range.
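The failure can be reproduced without a browser. Here is a minimal sketch of the empty-list case and a guard against it; first_text and FakeElement are hypothetical names used only for illustration, and FakeElement merely imitates the .text attribute of a Selenium element:

```python
def first_text(elements):
    """Return the text of the first element, or None when the list is empty."""
    if not elements:  # the lookup found nothing, so elements[0] would raise IndexError
        return None
    return elements[0].text


class FakeElement:
    """Stand-in for a Selenium WebElement; only the .text attribute is modelled."""

    def __init__(self, text):
        self.text = text


print(first_text([FakeElement("Player One")]))  # prints Player One
print(first_text([]))  # prints None
```

A guard like this only avoids the crash. It does not make the missing row appear, so it would normally be combined with some form of waiting.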
How to fix this?
An easy solution is to add a short wait (for example, 0.5 seconds) before searching:
time.sleep(0.5)
player = driver.find_elements_by_xpath(xpath)
This solution will work as long as the row loads within that time, but it slows down your script because 0.5 seconds is added to every iteration of the loop. A more elegant and preferable solution is to use an implicit or explicit wait; you can read about them in the official Python Selenium Waits documentation.
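As a rough sketch of the explicit-wait approach, assuming Selenium's WebDriverWait and expected_conditions helpers: row_xpath and scrape_players are hypothetical names, the XPath is copied from the question, and treating a timeout as the end of the table is my assumption, not something the page guarantees.

```python
# XPath template taken from the question; {row} is the 1-based table row.
ROW_XPATH = (
    "/html/body/div[4]/div[2]/div[3]/div[1]/div/div[2]"
    "/table/tbody/tr[{row}]/td[2]/div"
)


def row_xpath(row):
    """Build the XPath for the player cell in the given table row."""
    return ROW_XPATH.format(row=row)


def scrape_players(driver, max_rows=99, timeout=10):
    """Collect the player text of each row, waiting up to `timeout` seconds per row."""
    # Imported here so row_xpath() stays usable without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    players = []
    for row in range(1, max_rows + 1):
        locator = (By.XPATH, row_xpath(row))
        try:
            # Polls until the element is present instead of sleeping a fixed time.
            cell = wait.until(EC.presence_of_element_located(locator))
        except TimeoutException:
            break  # assumption: no row within the timeout means the table has ended
        players.append(cell.text)
    return players
```

After driver.get(url), calling scrape_players(driver) would return the names as a list. Rows that are already loaded return immediately, so the only full timeout you pay is on the first row that never appears.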
Good luck, I hope this helped.
Upvotes: 1
Reputation: 23
I didn't find a way to make this work with find_element_by_xpath. However, if you use driver.find_element_by_css_selector instead, it works just fine. So if anyone has the same problem, that is a possible solution.
Upvotes: 0