tsnn2d

Reputation: 141

Trouble targeting elements on website (selenium webdriver)

I am trying to scrape property listings from a real estate website. Ideally, I want to pull the marketing URL, title, location, and contact email of each listing. The properties are all listed like so:

<div class="propertyList">
        <div id="propertyList74495-sale" class="deal_on_market propertyListItem" data-property-id="74495-sale" data-listing-url="http://svncommercialadvisors.com/properties/?propertyId=74495-sale" data-listing-id="148815"></div>

           <table>
             <tbody>
                 <tr>
                   <td class="thumbnail">
                <a target="_top" href="http://svncommercialadvisors.com/properties/?propertyId=74495-sale"></a>
            </td>
            <td class="addressInfo">
                <a target="_top" href="http://svncommercialadvisors.com/properties/?propertyId=74495-sale">

                    Engelberg Antik's

                </a>
                <p class="propertiesListCityStateZip">
                    <img src="/images/map-marker-tiny.png?1427481879" alt="Map-marker-tiny"></img>


                    Salem, OR

                </p>
                <p class="description">

                    Outstanding downtown Salem opportunity, right next…

                </p>
                <div class="smallAttributes">
                    <div></div>
                    <div></div>
                    <div></div>
                </div>
            </td>
            <td class="propertyInfo">
                <div>

                    $479,900

                </div>
                <div>

                    13,612 SF

                </div>
                <div>

                    Street Retail

                </div>
            </td>
        </tr>
    </tbody>
</table>
<div class="contactAdvisor">
    ::before
    <a href="mailto:[email protected]"></a>


    or call
    503.588.0400
    for more information

</div>
<div class="links"></div>

        <div id="propertyList61436-sale" class="deal_under_contract propertyListItem" data-property-id="61436-sale" data-listing-url="http://svncommercialadvisors.com/properties/?propertyId=61436-sale" data-listing-id="124490"></div>

        <div id="propertyList89374-sale" class="deal_on_market propertyListItem" data-property-id="89374-sale" data-listing-url="http://svncommercialadvisors.com/properties/?propertyId=89374-sale" data-listing-id="173124"></div>

        <div id="propertyList84437-sale" class="deal_on_market propertyListItem" data-property-id="84437-sale" data-listing-url="http://svncommercialadvisors.com/properties/?propertyId=84437-sale" data-listing-id="164488"></div>

        <div id="propertyList84478-sale" class="deal_on_market propertyListItem" data-property-id="84478-sale" data-listing-url="http://svncommercialadvisors.com/properties/?propertyId=84478-sale" data-listing-id="164538"></div>

         ...

This was my first attempt at it:

from selenium import webdriver
import sys
import smtplib
import pymongo

newProperties = []

driver = webdriver.Firefox()
driver.get('http://svncommercialadvisors.com/properties/')

for property in driver.find_elements_by_class_name('propertyList'):
    #get title,location 
    info = property.find_elements_by_class_name('addressInfo')
    email = property.find_elements_by_partial_link_text('.com')

When I run the above, the driver does not raise any errors about being unable to locate elements. However, when I print the elements, nothing appears. How can I locate the elements more reliably? I would like something like this appended to a list for each property:

-title: Engelberg Antik's
-location: Salem, OR
-url: http://svncommercialadvisors.com/properties/?propertyId=74495-sale
-email: [email protected]

Upvotes: 1

Views: 87

Answers (1)

alecxe

Reputation: 473863

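The key problem here is that the search results are loaded in an iframe, and the driver only searches the document it is currently focused on. The find_elements_* methods return an empty list instead of raising when nothing matches, which is why you saw no errors and no output.

You need to switch to the iframe before searching for properties: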

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get('http://svncommercialadvisors.com/properties/')

# wait for frame to appear and switch
frame = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div#buildout iframe")))
driver.switch_to.frame(frame)

for property in driver.find_elements_by_class_name('propertyList'):
    info = property.find_element_by_class_name('addressInfo')
    email = property.find_element_by_partial_link_text('Email')

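    # info.text holds the title, location and description on separate lines
    print(info.text)
    # the href of the contact link is a mailto: URL
    print(email.get_attribute('href'))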

I've also applied two fixes:

  • replaced property.find_elements_by_class_name('addressInfo') with property.find_element_by_class_name('addressInfo')
  • replaced property.find_elements_by_partial_link_text('.com') with property.find_element_by_partial_link_text('Email')

It prints:

Engelberg Antik's
Salem, OR
Outstanding downtown Salem opportunity, right next door to the newly renovated Roth and McGilchri...
mailto:[email protected]
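
To get from that printed text to the title/location/url/email records asked for in the question, something like the following may help. It is only a sketch: build_record is a hypothetical helper name, it assumes info.text always puts the title on the first non-empty line and the location on the second (as in the output above), and the email address is a placeholder because the real one is obscured on the page.

```python
def build_record(info_text, listing_url, mailto_href):
    """Turn the scraped strings into the dict shape asked for in the question.

    Hypothetical helper; assumes line 1 of info_text is the title
    and line 2 is the location.
    """
    # Keep only non-empty lines, stripped of surrounding whitespace.
    lines = [line.strip() for line in info_text.splitlines() if line.strip()]
    title = lines[0] if lines else None
    location = lines[1] if len(lines) > 1 else None

    # Drop the "mailto:" scheme so only the address remains.
    prefix = "mailto:"
    if mailto_href.startswith(prefix):
        email = mailto_href[len(prefix):]
    else:
        email = mailto_href

    return {"title": title, "location": location,
            "url": listing_url, "email": email}


new_properties = []
new_properties.append(build_record(
    "Engelberg Antik's\nSalem, OR\nOutstanding downtown Salem opportunity",
    "http://svncommercialadvisors.com/properties/?propertyId=74495-sale",
    "mailto:advisor@example.com",  # placeholder address, not from the site
))
```

Inside the loop you would pass info.text, the href of the link inside info (for example via info.find_element_by_tag_name('a').get_attribute('href')), and email.get_attribute('href'). Keeping the parsing in a plain function means it can be checked without launching a browser.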

Upvotes: 1
