Selenium - find element by link text

Question

I am using Selenium WebDriver with Chrome and Python 3 on Windows 10. I want to scrape some reports from a database. I search with a company ID and a year, and the results are a list of links formatted in a specific way: something like year_companyID_seeminglyRandomDateAndDoctype.extension, e.g. 2018_2330_20020713F04.pdf. I want to get all PDFs of a certain doctype. I can grab all links for a certain doctype using webdriver.find_elements_by_partial_link_text('F04'), or all links with that extension by passing '.pdf' instead of 'F04', but I cannot figure out a way to check for both at once. First I tried something like

links = webdriver.find_elements_by_partial_link_text('F04')
for l in links:
    if l.find('.pdf') == -1:
        continue
    else:
        #do some stuff

But unfortunately, the links are WebElements:

print(links[0])
>> 
print(links[0].get_attribute('href'))
>> javascript:readfile2("F","2330","2015_2330_20160607F04.pdf")

so the conditional in the for loop above fails.

I see that I could probably access the necessary information in whatever that object is, but I would prefer to do the checks first when getting the links. Is there any way to check multiple conditions in the webdriver.find_elements_by_* methods?

Andersson · Accepted Answer

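You can filter on the href attribute in a list comprehension. The href is a JavaScript call that ends with .pdf") rather than .pdf, which is why the endswith argument includes the trailing ")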

links = [link.get_attribute('href') for link in webdriver.find_elements_by_partial_link_text('F04') if link.get_attribute('href').endswith('.pdf")')]

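Alternatively, an XPath expression can check both conditions in a single lookup. Note that this returns WebElements rather than href strings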

links = webdriver.find_elements_by_xpath('//a[contains(., "F04") and contains(@href, ".pdf")]')
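
If the check is easier to reason about in plain Python, it can also be applied to the href strings after they have been collected. The following is only a minimal sketch and not part of the original answer: it assumes every href has the form shown in the question, with the file name as the last quoted argument and the doctype directly before the extension. The helper names extract_filename and matches are hypothetical.

```python
import re

# Assumption: the file name is the last double-quoted argument of the
# javascript:readfile2(...) call, as in the href shown in the question.
FILENAME_RE = re.compile(r'"([^"]+)"\)\s*;?\s*$')


def extract_filename(href):
    """Return the last quoted argument of the href, or None if absent."""
    match = FILENAME_RE.search(href or "")
    return match.group(1) if match else None


def matches(href, doctype="F04", extension=".pdf"):
    """True if the file name ends with doctype followed by extension."""
    name = extract_filename(href)
    return name is not None and name.endswith(doctype + extension)


# Sample hrefs modelled on the one in the question (the last two are made up
# here to show a wrong extension and a wrong doctype).
hrefs = [
    'javascript:readfile2("F","2330","2015_2330_20160607F04.pdf")',
    'javascript:readfile2("F","2330","2015_2330_20160607F04.doc")',
    'javascript:readfile2("F","2330","2015_2330_20160607F01.pdf")',
]
wanted = [extract_filename(h) for h in hrefs if matches(h)]
print(wanted)
```

With Selenium, the hrefs list would come from calling get_attribute('href') on each WebElement returned by one of the lookups above. Extracting the file name first avoids hard-coding the trailing ") in the comparison.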
