Reputation: 23
I'm trying to build a simple scraper with Selenium to retrieve the zip code for a given address, city, and state from this USPS tool: https://tools.usps.com/zip-code-lookup.htm?byaddress
Here is my code. Most steps work, but I am struggling to retrieve the data I need at the very end (the zip code):
from selenium import webdriver
import time
import pandas as pd
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
# step 1: set path to chromedriver downloaded from here: https://chromedriver.chromium.org/downloads
PATH = "chromedriver" # modify to location on your system of above downloaded driver
driver = webdriver.Chrome(PATH)
driver.get("https://tools.usps.com/zip-code-lookup.htm?byaddress")
# step 2: specify the address to search for
street_address = "530 WILLIAM PENN PL"
city = "PITTSBURGH"
state = "PA"
# step 3: fill out the form with specified data in step 2
input_address = driver.find_element(By.ID, 'tAddress')
input_city = driver.find_element(By.ID, 'tCity')
drpState = Select(driver.find_element(By.ID, 'tState'))
input_address.send_keys(street_address)
time.sleep(1)
input_city.send_keys(city)
time.sleep(1)
drpState.select_by_value(state)
# step 4: select "Find button" on USPS page to advance
button_find = driver.find_element(By.ID, 'zip-by-address')
button_find.click()
time.sleep(2)
# step 5: retrieve zip code (the problem)
zipcode= driver.find_element(By.XPATH, '//*[@id="zipByAddressDiv"]/ul/li[1]/div[1]/p[3]/strong')
attrs=[]
for attr in zipcode.get_property('attributes'):
    attrs.append([attr['name'], attr['value']])
print(attrs)
As shown in the screenshot below, in the last step I specify an XPath that I obtained by inspecting the zip code in the browser. I then try to list the attributes of the zipcode WebElement, but the list comes out empty: there is no error, the element simply has no attributes to return.
I would appreciate any help. Thank you in advance.
Upvotes: 0
Views: 512
Reputation: 142641
I'm not sure why you are using get_property('attributes').
For that to return anything, the page would need something like <strong name="..." value="...">, but it only has a bare <strong>.
If you want the tag name, use zipcode.tag_name, and if you want the text 15219-1820 inside <strong> </strong>, use zipcode.text:
zipcode = driver.find_element(By.XPATH, '//*[@id="zipByAddressDiv"]/ul/li[1]/div[1]/p[3]/strong')
print(zipcode.tag_name)
print(zipcode.text)
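To see why the attribute list comes back empty, it can help to look at the markup without a browser. The sketch below is not Selenium: it parses a hand-written fragment with Python's standard xml.etree.ElementTree. The fragment is an assumption modeled on the XPath in the question (only the zipByAddressDiv id and the 15219-1820 value come from this thread), so the real USPS page may differ. The bare <strong> has no attributes, while its text holds the zip code, which mirrors the difference between get_property('attributes') and .text.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like the result markup implied by the XPath
# in the question; the real USPS page may differ.
FRAGMENT = """
<div id="zipByAddressDiv">
  <ul>
    <li>
      <div>
        <p>530 WILLIAM PENN PL</p>
        <p>PITTSBURGH PA</p>
        <p><strong>15219-1820</strong></p>
      </div>
    </li>
  </ul>
</div>
"""

root = ET.fromstring(FRAGMENT)
# Same path as the question's XPath, written relative to the outer div.
strong = root.find("./ul/li[1]/div[1]/p[3]/strong")

strong_attributes = dict(strong.attrib)  # a bare <strong> carries no attributes
strong_text = strong.text                # the zip code is the element's text

print(strong.tag, strong_attributes, strong_text)
```

The same reasoning applies to the Selenium element: there is nothing to list under attributes, so the value has to be read from the element's text.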
Upvotes: 1
Reputation: 3
(Going off the image)
You could instead get the element with the class zipcode-by-address, collect its child elements, and find the <strong> among them:
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
... # navigation stuff here.
element = driver.find_element(By.CLASS_NAME, "zipcode-by-address")
# Collect every descendant of the container: https://stackoverflow.com/questions/24795198/get-all-child-elements
all_children_by_xpath = element.find_elements(By.XPATH, ".//*")
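The container-then-children idea can also be sketched without a browser. This is a stand-in that uses xml.etree.ElementTree rather than Selenium, and the markup is assumed for illustration: only the zipcode-by-address class name comes from this answer. It locates the container by class, walks its descendants the way ".//*" would, and keeps the text of any <strong> it finds.

```python
import xml.etree.ElementTree as ET

# Assumed markup: the class name comes from the answer, the rest is illustrative.
FRAGMENT = """
<ul>
  <li class="zipcode-by-address">
    <div>
      <p>530 WILLIAM PENN PL</p>
      <p><strong>15219-1820</strong></p>
    </div>
  </li>
</ul>
"""

root = ET.fromstring(FRAGMENT)
container = root.find(".//*[@class='zipcode-by-address']")

# Equivalent of the ".//*" XPath: every descendant of the container.
descendants = container.findall(".//*")

# Keep only the <strong> elements and read their text.
zip_codes = [el.text for el in descendants if el.tag == "strong"]
print(zip_codes)
```

With Selenium, the filtering step would be a loop over all_children_by_xpath that checks each child's tag_name and reads its text.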
Upvotes: 0