Reputation: 3782

How to browse over a page using PhantomJS and Selenium

I have some DIV elements on a web page. In total there are about 30 DIV blocks with the following structure:

<div class="w-dyn-item">  
<a href="/project/soft" class="jobs-wrapper no-line w-inline-block w-clearfix">
<div class="jobs-client">
<img data-qazy="true" src="https://global.com/test.jpg" alt="Soft" class="image-9">
<div style="background-color:#cd7f32" class="job-time">Level 1</div>
</div>
<div class="jobs-content w-clearfix">
<div class="w-clearfix">
<div class="text-block-19 w-condition-invisible">PROMO</div>
<h3 class="job-title">Soft</h3>
<img height="30" data-qazy="true" src="https://global.com/test.jpg" alt="Soft" class="image-15 w-hidden-main w-hidden-medium w-hidden-small"></div>
<div class="div-block w-clearfix">
<div class="text-block-4">Italy</div>
<div class="text-block-4 w-hidden-small w-hidden-tiny">AMB</div>
<div class="text-block-4 w-hidden-small w-hidden-tiny">GTL</div>
<div class="text-block-13">January 10, 2017</div><div class="text-block-14">End date:</div></div><div class="space small"></div><p class="paragraph-3">Text text text</p></div>   
</a> 
</div>

I am trying to access the href of the a element and click the link. The problem is that I cannot use find_element_by_link_text, because there is no link text to match. Is it possible to locate the a element by its class, class="jobs-wrapper no-line w-inline-block w-clearfix"? When I used find_element_by_class_name, I got the error Message: {"errorMessage":"Compound class names not permitted","request

from selenium import webdriver
driver = webdriver.PhantomJS()
driver.set_window_size(1120, 550)
driver.get("https://myurl.com/")
driver.find_element_by_link_text("//a href").click()
print driver.current_url
driver.quit()

Upvotes: 0

Answers (3)

Evya

Reputation: 2375

The error you're getting is because Selenium's find_element_by_class_name does not support multiple classes.
Use a CSS selector with find_elements_by_css_selector instead:

driver.find_elements_by_css_selector('.jobs-wrapper.no-line.w-inline-block.w-clearfix')

This selects every element that has all of those classes. You can then iterate over the returned list and call click() or any other action on each element.
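
The dotted selector is just the class attribute with each space-separated class prefixed by a dot and the pieces joined together. A minimal sketch of that conversion, assuming a hypothetical helper named class_attr_to_css (it is not part of Selenium):

```python
def class_attr_to_css(class_attr, tag=""):
    """Turn a space-separated class attribute into a CSS selector.

    Hypothetical helper for illustration; it is not part of Selenium.
    """
    # split() with no arguments collapses runs of whitespace and drops empties.
    classes = class_attr.split()
    if not classes:
        raise ValueError("class attribute is empty")
    # Chaining ".name" parts without spaces means
    # "an element that has all of these classes".
    return tag + "".join("." + name for name in classes)


selector = class_attr_to_css(
    "jobs-wrapper no-line w-inline-block w-clearfix", tag="a"
)
print(selector)
```

The resulting string can be passed to find_elements_by_css_selector as in the line above. Prefixing the tag name (here a) is optional and only narrows the match.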

EDIT

Following your comment, new snippet to help you do what you want:

result = {}
urls = []
# 'elements' is the list you previously obtained using the CSS selector
for element in elements:
    urls.append(element.get_attribute('href'))


# Now you can iterate over all extracted hrefs:
for url in urls:
    url_data = {}
    driver.get(url)
    field1 = driver.find_element_by_id('wanted_id_1')
    url_data['field1'] = field1
    field2 = driver.find_element_by_id('wanted_id_2')
    url_data['field2'] = field2
    result[url] = url_data

Now, result is a dictionary in a structure similar to what you wanted.

Note that field1 and field2 are of type WebElement so you'll probably need to do something with them first (extract attribute, text, etc).

Also, on a personal note: look into requests together with BeautifulSoup. They might be a much better fit than Selenium for this and similar cases.
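
To illustrate the idea of parsing the HTML directly instead of driving a browser, here is a dependency-free sketch. It uses the standard library's html.parser as a stand-in for BeautifulSoup, and a trimmed copy of the markup from the question instead of a real HTTP response. The class name JobLinkParser is made up for this example, and the approach only works if the links are present in the HTML the server sends (not injected later by JavaScript).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class JobLinkParser(HTMLParser):
    """Collect href values of <a> tags that carry a given class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # The class attribute is a space-separated list of class names.
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if self.wanted_class in classes and href:
            self.hrefs.append(href)


# Trimmed-down copy of the markup from the question.
SAMPLE_HTML = """
<div class="w-dyn-item">
<a href="/project/soft" class="jobs-wrapper no-line w-inline-block w-clearfix">
<h3 class="job-title">Soft</h3>
</a>
</div>
"""

parser = JobLinkParser("jobs-wrapper")
parser.feed(SAMPLE_HTML)
# Resolve relative hrefs against the page URL used in the question.
urls = [urljoin("https://myurl.com/", href) for href in parser.hrefs]
print(urls)
```

With a real page, SAMPLE_HTML would be replaced by the body of the downloaded response, and the collected URLs could then be fetched one by one.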

Upvotes: 2

Omar Einea

Reputation: 2524

If your only requirement is to click the a tag inside a tag with w-dyn-item class, then you could do it like this:

driver.find_element_by_class_name("w-dyn-item").find_element_by_tag_name("a").click()

To iterate over all tags with the w-dyn-item class, click the a inside each one, do something on the target page, and then go back, do this:

tags = driver.find_elements_by_class_name("w-dyn-item")
for i in range(len(tags)):
    tag = driver.find_elements_by_class_name("w-dyn-item")[i]
    tag.find_element_by_tag_name("a").click()
    # Do what you want inside the page...
    driver.back()

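The key here is of course to go back to the root page after you're done with the inner page, and to look the elements up again on every iteration, because references obtained before navigating away become stale.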

Upvotes: 2

undetected Selenium

Reputation: 193338

To locate the a element by its href and click it, you can use the following line of code:

driver.find_element_by_xpath("//div[@class='w-dyn-item']/a[@href='/project/soft']").click()
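
Note that this expression matches only the one link whose href is /project/soft, and @class='w-dyn-item' requires the class attribute to be exactly that string. Dropping the href predicate matches the link in every block. The sketch below shows the difference outside the browser, using the limited XPath subset of xml.etree.ElementTree on a simplified, well-formed fragment; the second block with /project/hard is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the page; the second item is hypothetical.
SAMPLE = """
<body>
  <div class="w-dyn-item"><a href="/project/soft">Soft</a></div>
  <div class="w-dyn-item"><a href="/project/hard">Hard</a></div>
</body>
"""

root = ET.fromstring(SAMPLE)

# With the href predicate: only the single matching link.
one = root.findall(".//div[@class='w-dyn-item']/a[@href='/project/soft']")

# Without it: the link inside every w-dyn-item block.
all_links = root.findall(".//div[@class='w-dyn-item']/a")
hrefs = [a.get("href") for a in all_links]
print(len(one), hrefs)
```

In Selenium, the equivalent of the second query would be find_elements_by_xpath("//div[@class='w-dyn-item']/a"), which returns a list to iterate over.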

Upvotes: 1
