Reputation: 22440
I've written a script in Python with Selenium to grab the Director name and Phone number from a webpage. When I execute my script, I get a result like the one below, with everything in a single list item:
['Director: Cheryl Hughley\nPhone: 661-421-5861\nEmail: [email protected]']
How can I parse only the name and the phone number on the fly from that site in separate fields like:
name: Cheryl Hughley
phone: 661-421-5861
This is what I tried that produces the result within a list (first example) above:
from selenium import webdriver

link = "https://www.nafe.com/bakersfield-nafe-network"

def search_info(driver, url):
    driver.get(url)
    # Keep only the paragraphs that mention a phone number
    info = [item.text.strip() for item in driver.find_elements_by_css_selector(".markdown p") if "Phone" in item.text]
    print(f'{info}')

if __name__ == '__main__':
    driver = webdriver.Chrome()
    try:
        search_info(driver, link)
    finally:
        driver.quit()
I do not wish to post-process the results after they are scraped; rather, I wish to get the separate fields on the fly. Would regex be a good option here? Thanks.
Upvotes: 1
Views: 42
Reputation: 52665
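You can try the solution below, which reads the paragraph's text nodes one at a time (child nodes 0 and 2 are the lines on either side of the first line break):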
info = [driver.execute_script("return arguments[0].childNodes[arguments[1]].textContent;", item, index).strip() for index in [0, 2] for item in driver.find_elements_by_css_selector(".markdown p") if "Phone" in item.text]
to get the output
['Director: Cheryl Hughley', 'Phone: 661-421-5861']
or
info = [driver.execute_script("return arguments[0].childNodes[arguments[1]].textContent;", item, index).split(": ")[-1].strip() for index in [0, 2] for item in driver.find_elements_by_css_selector(".markdown p") if "Phone" in item.text]
to get
['Cheryl Hughley', '661-421-5861']
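If the node positions in the paragraph ever shift, splitting the element's text on its labels may be a more forgiving option than indexing `childNodes`. The following is only a sketch that works on the string shown in the question. The helper name `parse_contact` is made up for illustration, and it assumes each field sits on its own line in `Label: value` form. A regex would also work, but plain string splitting seems to be enough here.

```python
def parse_contact(text):
    """Return the director name and phone number found in a block of text.

    Hypothetical helper: assumes one "Label: value" pair per line.
    """
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()
    # Missing labels come back as None instead of raising
    return {"name": fields.get("director"), "phone": fields.get("phone")}


# The string from the question, as returned by item.text
sample = "Director: Cheryl Hughley\nPhone: 661-421-5861\nEmail: [email protected]"
result = parse_contact(sample)
print(f"name: {result['name']}")
print(f"phone: {result['phone']}")
```

Inside the original loop this could be called as `parse_contact(item.text)` for each matching paragraph, so the fields are separated at the moment they are read.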
Upvotes: 1