SublimizeD

Reputation: 134

Looping through a list of XPaths using Regex, python

Here is my code

src = driver.page_source
XPATHLoop=['//*[@id="General"]/fieldset/dl/dd[15]', '//*[@id="General"]/fieldset/dl/dd[14]', '//*[@id="General"]/fieldset/dl/dd[13]','//*[@id="General"]/fieldset/dl/dd[12]']
for d in XPATHLoop:
    Checkpath = re.search(d,src)
    if Checkpath =='//*[@id="General"]/fieldset/dl/dd[15]':
        Status= driver.find_element_by_xpath('//*[@id="General"]/fieldset/dl/dd[8]').text
        break
    elif  Checkpath == '//*[@id="General"]/fieldset/dl/dd[14]':
        Status= driver.find_element_by_xpath('//*[@id="General"]/fieldset/dl/dd[7]').text
        break
    elif Checkpath == '//*[@id="General"]/fieldset/dl/dd[13]':
        Status = driver.find_element_by_xpath('//*[@id="General"]/fieldset/dl/dd[6]').text
        break
    elif Checkpath == '//*[@id="General"]/fieldset/dl/dd[12]':
        Status= driver.find_element_by_xpath('//*[@id="General"]/fieldset/dl/dd[5]').text
        break
    else:
        Status= "NULL"
print(Status)

The output is 'NULL', meaning nothing is being matched, even though these paths can exist in the page. I am using Selenium and regex. I suspect there may be a different method, in regex or otherwise, for checking which of the XPaths exist.

Upvotes: 0

Views: 86

Answers (3)

SublimizeD

Reputation: 134

My solution to this bug is below. I nested a while loop inside a for loop. The list XPATHLoop holds the numeric indexes of the dt/dd items. The for loop runs until it finds the dt I want, 'Status Date'. Once it finds that text, it enters a while loop that steps through the following dd items until one in the status date format (containing a four-digit number) appears. This is needed because the dd items are completely independent of the dt items on the webpage.

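import re
from selenium.common.exceptions import NoSuchElementException

XPATHLoop = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
Status = "NULL"  # fallback if 'Status Date' is never found

for d in XPATHLoop:
    try:
        findtool = driver.find_element_by_xpath('//*[@id="General"]/fieldset/dl/dt[' + str(d) + ']')
        if findtool.text == 'Status Date':
            Status = driver.find_element_by_xpath('//*[@id="General"]/fieldset/dl/dd[' + str(d) + ']').text
            # The dd items do not line up with the dt items, so keep
            # stepping forward until the text contains four digits in a row.
            i = 1
            while re.search(r'\d{4}', Status) is None:
                Status = driver.find_element_by_xpath('//*[@id="General"]/fieldset/dl/dd[' + str(d + i) + ']').text
                i = i + 1
            break
    except NoSuchElementException:
        # This dt or dd index does not exist on the page; try the next one.
        pass
print(Status)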
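
To make the scanning logic easier to follow without a browser, here is a minimal sketch of the same idea on plain lists. The function name find_status_date and the sample dt/dd texts are made up for illustration; they are not taken from the real page.

```python
import re

def find_status_date(dt_texts, dd_texts, label="Status Date"):
    """Return the first dd text, at or after the label's position,
    that contains four digits in a row; return None if there is none."""
    for pos, text in enumerate(dt_texts):
        if text == label:
            # Step forward through the dd items from the label's position.
            for candidate in dd_texts[pos:]:
                if re.search(r"\d{4}", candidate):
                    return candidate
            return None
    return None

# Hypothetical sample data: the dd list is longer than the dt list,
# so the date does not sit at the same index as its label.
dt_texts = ["Name", "Status", "Status Date"]
dd_texts = ["Widget", "Active", "pending review", "n/a", "2019-03-01"]
print(find_status_date(dt_texts, dd_texts))
```

Unlike the Selenium version, the inner loop here is bounded by the length of the list, so it cannot run past the last dd item.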

Upvotes: 1

JeffC

Reputation: 25611

From the code you posted, the hardcoded strings are identical except for the index of the dd element. Since the Status index is always 7 less than the Checkpath index, you can loop from 15 down to 12, do your search, and take Status from index d - 7 (for example, 15 - 7 = 8). The code is below.

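loop = [15, 14, 13, 12]
Status = "NULL"  # fallback if none of the dd elements exist

for d in loop:
    # Ask the driver whether dd[d] exists. find_elements returns an empty list
    # when nothing matches; an XPath is not a regex and never appears in page_source.
    Checkpath = driver.find_elements_by_xpath('//*[@id="General"]/fieldset/dl/dd[' + str(d) + ']')
    if Checkpath:
        Status = driver.find_element_by_xpath('//*[@id="General"]/fieldset/dl/dd[' + str(d - 7) + ']').text
        break
print(Status)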
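
The offset rule can be checked on its own, without a driver. This is only a sketch: BASE, OFFSET, and status_xpath are hypothetical names introduced here, and the offset of 7 is read off the hardcoded pairs in the question.

```python
BASE = '//*[@id="General"]/fieldset/dl/dd'
OFFSET = 7  # dd[15] -> dd[8], dd[14] -> dd[7], and so on

def status_xpath(check_index):
    """Build the XPath of the Status element for a given Checkpath index."""
    return f"{BASE}[{check_index - OFFSET}]"

for d in (15, 14, 13, 12):
    print(d, "->", status_xpath(d))
```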

Upvotes: 1

Sers

Reputation: 12255

To parse HTML and XML documents (page source) and get elements with locators, use a parsing library rather than regular expressions.
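
As a rough illustration of locating elements with a parser, the sketch below uses Python's standard xml.etree.ElementTree, which is not necessarily the library this answer had in mind. ElementTree supports only a subset of XPath and needs well-formed markup, so a real page would usually call for a dedicated HTML parser. The markup is a made-up fragment.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for the real page.
html = """<div id="General"><fieldset><dl>
<dt>Name</dt><dd>Widget</dd>
<dt>Status Date</dt><dd>2019-03-01</dd>
</dl></fieldset></div>"""

# Wrap the fragment so the div is a descendant of the root element.
root = ET.fromstring("<root>" + html + "</root>")

# Collect every dd under the element with id="General".
dds = root.findall(".//*[@id='General']/fieldset/dl/dd")
texts = [dd.text for dd in dds]

# Positional predicates are 1-based, as in full XPath.
second = root.find(".//*[@id='General']/fieldset/dl/dd[2]")

print(len(dds), texts, second.text)
```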

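Regular expressions do not parse HTML documents. An XPath is a locator, not text that appears in the page source, so Checkpath = re.search(d,src) is None. None never equals any of the XPath strings, so every comparison fails and you get NULL.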
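
A small experiment shows this without a browser. The page source below is invented for the demonstration; only re.search and re.escape from the standard library are used.

```python
import re

# Hypothetical page source, made up for this illustration.
src = '<div id="General"><fieldset><dl><dt>Status Date</dt><dd>2019-03-01</dd></dl></fieldset></div>'
xpath = '//*[@id="General"]/fieldset/dl/dd[1]'

# re.search treats the XPath as a regex pattern and looks for it in the raw HTML text.
match = re.search(xpath, src)

# Escaping the pattern does not help: the literal XPath string is not in the HTML either.
literal_match = re.search(re.escape(xpath), src)

# Even a successful search would return a match object, never the pattern string,
# so comparing the result to the XPath with == can never be true.
print(match, literal_match, match == xpath)
```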

Here is an example of how you can get the status without a loop and without parsing the page source.

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 10)

p = wait.until(EC.visibility_of_all_elements_located((By.XPATH, '//*[@id="General"]/fieldset/dl/dd')))

status = "NULL"

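# XPath dd[n] is 1-based, while the Python list p is 0-based, so dd[n] is p[n - 1].
count = len(p)
if count >= 15:
    status = p[7].text   # dd[8]
elif count == 14:
    status = p[6].text   # dd[7]
elif count == 13:
    status = p[5].text   # dd[6]
elif count == 12:
    status = p[4].text   # dd[5]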

Upvotes: 2
