Reputation: 4434
I am trying to extract some data from a website. However, the site has a hierarchical structure. It has a dropdown menu at the top, whose option values are URLs. Thus, my approaches are:
Below is my code. I am able to extract data under the default selected option (the first one), but then I get the error "Message: Element not found in the cache - perhaps the page has changed since it was looked up". It seems like my browser was not switched to the new page. I tried time.sleep() and driver.refresh(), but both failed. Any suggestions are appreciated!
<select class="form-control">
<option value="/en/url1">001 Key</option>
<option value="/en/url2">002 Key</option>
</select>
# select the dropdown menu
select_box = Select(driver.find_element_by_xpath("//select[@class='form-control']"))
# get all options
options = select_box.options
for ele_index, element in enumerate(options):
    # select a url
    select_box.select_by_index(ele_index)
    time.sleep(5)
    print element.text
    # extract page data
    id_comp_html = driver.find_elements_by_class_name('HorDL')
    for q in id_comp_html:
        print q.get_attribute("innerHTML")
        print "============="
# dropdown menu
select_box = Select(driver.find_element_by_xpath("//select[@class='form-control']"))
options = select_box.options
for ele_index in range(len(options)):
    # select a url
    select_box = Select(driver.find_element_by_xpath("//select[@class='form-control']"))
    print select_box.options[ele_index].text
    select_box.select_by_index(ele_index)
    # print element.text
    # print "======"
    driver.implicitly_wait(5)
    id_comp_html = driver.find_elements_by_class_name('HorDL')
    for q in id_comp_html:
        print q.get_attribute("innerHTML")
        print "============="
Upvotes: 2
Views: 3203
Reputation: 3130
From your description of the site and your code, it looks like selecting an option from the dropdown sends you to another page. After the first iteration of the for loop you are on a different page, while your options variable still points to elements from the previous page.
One solution, specific to your situation (and likely the best in this case), would be to store the option values (i.e. the URLs) and navigate to them directly via the .get() method.
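A minimal sketch of that idea, assuming Python 3 and a Selenium 4 style driver whose find_elements(by, value) accepts the literal locator string "css selector" (the value of By.CSS_SELECTOR). The function names collect_option_urls and scrape_all are made up for illustration, and the selectors simply mirror the form-control and HorDL class names from the question:

```python
from urllib.parse import urljoin


def collect_option_urls(driver, base_url):
    """Return absolute URLs built from the dropdown's option values."""
    # "css selector" is the literal value of selenium's By.CSS_SELECTOR.
    options = driver.find_elements("css selector", "select.form-control option")
    # Read the values as plain strings now, before any navigation
    # invalidates the element references.
    return [urljoin(base_url, opt.get_attribute("value")) for opt in options]


def scrape_all(driver, start_url):
    """Visit each option URL directly and collect the innerHTML of .HorDL elements."""
    driver.get(start_url)
    results = {}
    for url in collect_option_urls(driver, driver.current_url):
        # Navigate straight to the page instead of clicking the dropdown.
        driver.get(url)
        blocks = driver.find_elements("css selector", ".HorDL")
        results[url] = [b.get_attribute("innerHTML") for b in blocks]
    return results
```

Because only strings survive between page loads, nothing in the loop can go stale. With a real browser this would be called with something like scrape_all(webdriver.Firefox(), start_url), where start_url is the page that hosts the dropdown.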
Otherwise, you would either need to keep a counter and get the contents of the dropdown with every iteration, or navigate backwards after every iteration, both choices being unnecessary in this case.
Upvotes: 1
Reputation: 473773
Your select_box and element references have gone stale. You have to "re-find" the select element inside the loop while iterating over the option indexes:
# select the dropdown menu
select_box = Select(driver.find_element_by_xpath("//select[@class='form-control']"))
# get all options
options = select_box.options
for ele_index in range(len(options)):
    # select a url
    select_box = Select(driver.find_element_by_xpath("//select[@class='form-control']"))
    select_box.select_by_index(ele_index)
    # ...
    element = select_box.options[ele_index]
You might also need to navigate back after selecting an option and extracting the desired data. This can be done via driver.back().
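Putting the re-find and the driver.back() call together might look roughly like the sketch below. It is written for Python 3 and clicks the option element directly rather than going through Select, so it needs only the driver. The function name scrape_by_reselecting is made up for illustration. The sketch assumes that the navigation has finished by the time click() returns (otherwise an explicit wait would be needed) and that going back lands on a page with the same dropdown:

```python
def scrape_by_reselecting(driver):
    """Select each option in turn, re-finding the dropdown on every pass."""
    # "css selector" is the literal value of selenium's By.CSS_SELECTOR.
    selector = "select.form-control option"
    # Only the count is kept; element references are looked up again below.
    count = len(driver.find_elements("css selector", selector))
    results = []
    for index in range(count):
        start_url = driver.current_url
        # Fresh lookup: references from an earlier page would be stale.
        option = driver.find_elements("css selector", selector)[index]
        label = option.text  # read the text before the page changes
        option.click()
        blocks = driver.find_elements("css selector", ".HorDL")
        results.append((label, [b.get_attribute("innerHTML") for b in blocks]))
        # Go back only if the click actually moved to another page, so the
        # already-selected default option does not pop an extra history entry.
        if driver.current_url != start_url:
            driver.back()
    return results
```

The URL comparison is a design choice for the default option: selecting the entry that is already active does not navigate, so an unconditional driver.back() would leave the site.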
Upvotes: 1