anaonyoamosu

Reputation: 23

How to retrieve text from HTML to Python using Selenium

Hi, I know there are already questions about this, but I don't have enough experience to figure it out. I am trying to write a simple Python script that periodically checks the earliest available date on a DMV website. Where I live, the DMV is backed up for months, and after just failing a driver's test I want to snag the earliest available date when somebody cancels their appointment.

Anyways, here is the HTML I am trying to grab from:

          <div _ngcontent-glu-c19="" class="department-appointment-header">Earliest date:</div>
          <br _ngcontent-glu-c19="">
          <div _ngcontent-glu-c19="">
            Monday
          </div>
          <div _ngcontent-glu-c19="">
             May 31st
          </div>
        </div>

Now, I am trying to grab that May 31st date so I can compare it with an earliest-date variable that updates whenever a sooner date than the existing one appears. Eventually I will have Python notify me by text. I can't figure out how to retrieve the May 31st text and assign it to a string variable or list, so I can convert the month/day to an integer between 1 and 365.

I'm new to Selenium and I haven't touched Python in a while, so I'm quite rusty, and all help would be appreciated. If you need more of the HTML, let me know and I'll add more; I just didn't want to fill this entire page.

Upvotes: 1

Views: 46

Answers (2)

undetected Selenium

Reputation: 193048

To print the text May 31st you can use either of the following Locator Strategies:

  • Using xpath and class attribute:

    print(driver.find_element(By.XPATH, "//div[@class='department-appointment-header']//following-sibling::div[2]").text)
    
  • Using xpath and text content:

    print(driver.find_element(By.XPATH, "//div[text()='Earliest date:']//following-sibling::div[2]").text)
    

Ideally, you should use WebDriverWait with visibility_of_element_located() so the element has time to become visible before its text is read. You can use either of the following Locator Strategies:

  • Using XPATH and class attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='department-appointment-header']//following-sibling::div[2]"))).text)
    
  • Using XPATH and text content:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[text()='Earliest date:']//following-sibling::div[2]"))).text)
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
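
Once the text has been retrieved, the conversion the question asks about (month/day to an integer between 1 and 365) could be sketched as below. This is only a minimal sketch under stated assumptions: the function names `day_of_year` and `update_earliest` are hypothetical, the page text is assumed to look like "May 31st" with an English month name, and the year is assumed to be the current one unless passed in.

```python
import re
from datetime import date, datetime


def day_of_year(text, year=None):
    """Convert text such as 'May 31st' to a day-of-year integer (1-366).

    Hypothetical helper: assumes an English month name followed by a day
    with an ordinal suffix, and an English/C locale for strptime's %B.
    """
    if year is None:
        year = date.today().year
    # Drop the ordinal suffix: '31st' -> '31', '2nd' -> '2'
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)", r"\1", text.strip())
    parsed = datetime.strptime(f"{cleaned} {year}", "%B %d %Y")
    return parsed.timetuple().tm_yday


def update_earliest(current_best, candidate):
    """Return the sooner of two day numbers; None means nothing seen yet."""
    if current_best is None or candidate < current_best:
        return candidate
    return current_best


# Example: feed in the string returned by the element's .text
earliest = None
earliest = update_earliest(earliest, day_of_year("May 31st", 2021))
```

For a non-leap year such as 2021, `day_of_year("May 31st", 2021)` gives 151. Note that comparing day numbers alone does not handle appointments that roll over into the next year; comparing full `date` objects would be a safer choice in that case.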

Upvotes: 1

Mick

Reputation: 796

Try:

    # Requires: from selenium.webdriver.common.by import By
    date = driver.find_element(By.XPATH, "//div[@_ngcontent-glu-c19='']").text

I'm not sure it'll work, but it's worth a try.

Upvotes: 0
