How to scrape the #text xml using selenium webdriver python?

Question

I am trying to scrape the Century Office Products, Inc page on this website and I am unable to extract this text:

Century Office Products, Inc. industry is listed as Ret Misc Merchandise

because it sits in a #text node rather than in an element of its own. This is the code I have tried:

driver.get('https://www.corporationwiki.com/New-Jersey/Middlesex/century-office-products-inc/53844156.aspx')
text = [k.text for k in driver.find_elements_by_xpath("//div[@class='card']//div[@class='card-body']//h2//following::p[2]")]

undetected Selenium · Accepted Answer

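A #text node is not an element, so a Selenium locator cannot select it directly. To extract the text Century Office Products, Inc., wait for the parent <h1> with WebDriverWait and visibility_of_element_located(), then read its last child node through execute_script(). You can use the following Locator Strategy: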

Code block (XPath locator):

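from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("start-maximized")
chrome_options.add_argument('disable-infobars')
chrome_options.add_argument('--allow-running-insecure-content')
# executable_path is accepted by Selenium 3.x and early 4.x; newer 4.x releases take service=Service(path) instead
driver = webdriver.Chrome(options=chrome_options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://www.corporationwiki.com/New-Jersey/Middlesex/century-office-products-inc/53844156.aspx")
# Wait until the <h1> is visible, then read its last child (the #text node) via JavaScript
heading = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h1[@itemprop='legalName']")))
print(driver.execute_script('return arguments[0].lastChild.textContent;', heading).strip())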

Console Output:
```
Century Office Products, Inc.
```
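
The same idea can be generalized when the wanted text is not the last child, or when an element holds several #text nodes. The following is only a sketch: own_text_nodes and OWN_TEXT_JS are hypothetical names, not part of Selenium, and the helper assumes nothing beyond the driver's execute_script(script, *args) method. It asks the browser for every direct #text child of an element and drops the whitespace-only ones.

```python
# JavaScript run in the browser: collect the element's direct #text children,
# trimmed, skipping whitespace-only nodes.
OWN_TEXT_JS = """
return Array.from(arguments[0].childNodes)
    .filter(n => n.nodeType === Node.TEXT_NODE)
    .map(n => n.textContent.trim())
    .filter(t => t.length > 0);
"""


def own_text_nodes(driver, element):
    """Return the non-empty direct text nodes of `element` as a list of str.

    `driver` is any object exposing Selenium's execute_script(script, *args);
    `element` is a WebElement that was already located.
    (Hypothetical helper, not a Selenium API.)
    """
    result = driver.execute_script(OWN_TEXT_JS, element)
    # execute_script returns None when the script yields null/undefined.
    return [str(t) for t in (result or [])]


# Usage sketch (the XPath is a placeholder for the parent of the #text node):
# parent = driver.find_element(By.XPATH, "<xpath of the parent element>")
# print(own_text_nodes(driver, parent))
```

Because the filtering happens in the page, the element still has to be located with an ordinary element locator first; only the final step of reading the text node goes through JavaScript.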
