Marios

Reputation: 27348

Get next html element with selenium - python

I have the following HTML body, which contains a list of elements. This HTML is just for demonstration; in the actual page, the list contains more than 20 properties.

<dl>
   <dt class="sc-ellipsis">Merk</dt>
   <dd>
       <a href="https://www.autoscout24.nl/auto/audi/">Audi</a>
   </dd>
   <dt class="sc-ellipsis">Model</dt>
   <dd>
       <a href="/lst/audi/q3">Q3</a>
   </dd> ....more properties like that
</dl>

I would like to get the words Audi and Q3.

I can simply do this in Selenium:

browser.find_elements_by_css_selector('dd')[0].text # to get Audi
browser.find_elements_by_css_selector('dd')[1].text # to get Q3

However, some of the elements are sometimes missing, so I cannot rely on the positions used above. For example, if Audi is missing, then this:

browser.find_elements_by_css_selector('dd')[0].text # now it returns Q3

returns Q3. One consistent pattern is that Audi always follows Merk and Q3 always follows Model. In other words, if Merk is not in the HTML body, Audi won't be either. I tried to find the element that immediately follows Merk:

WebDriverWait(browser, 10).until(EC.visibility_of_all_elements_located((By.XPATH, './/[(@class="sc-ellipsis") and (text()="Merk")]/following-sibling::dd')))[0].text

But this returns an empty list, which means it didn't find Audi. Does anyone know how to get the element that follows Merk (or Model, or whatever comes next in the list)? I can handle the missing case myself, so that if Merk is not part of the list, the code doesn't try to get the next element.

Upvotes: 0

Views: 425

Answers (1)

Sri

Reputation: 2318

The following code returns the text of the dd that follows the dt with the text "Merk":

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('https://www.autoscout24.nl/aanbod/audi-q3-sportback-pro-line-business-35-tfsi-110-kw-150-p-benzine-zilver-757ef256-c967-457b-8db1-4cb8b287c311?cldtidx=19')
# find_element returns the first <dd> sibling after the <dt> whose text is "Merk"
elem = browser.find_element(By.XPATH, '//dt[text()="Merk"]/following-sibling::dd')
print(elem.text)

After examining your code, it seems the only issue was that the first step of your XPath did not specify a tag name. Use either the wildcard (*) or dt:

'.//*[(@class="sc-ellipsis") and (text()="Merk")]/following-sibling::dd'
'.//dt[(@class="sc-ellipsis") and (text()="Merk")]/following-sibling::dd'
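
Since the question mentions that some properties may be missing, here is a minimal sketch of a lookup that returns None instead of raising when a label is absent. The helper names following_dd_xpath and get_property are hypothetical, not part of Selenium. The sketch assumes labels contain no double quotes, and it adds [1] so that only the dd immediately paired with the dt is matched.

```python
def following_dd_xpath(label):
    """Build an XPath for the first <dd> after the <dt> whose text is `label`."""
    # Assumption: `label` contains no double quotes (they would break the XPath).
    return f'//dt[normalize-space()="{label}"]/following-sibling::dd[1]'


def get_property(browser, label):
    """Return the text of the <dd> paired with `label`, or None if it is absent."""
    # "xpath" is the string value of Selenium's By.XPATH locator strategy.
    # find_elements returns an empty list (rather than raising) when nothing matches.
    matches = browser.find_elements("xpath", following_dd_xpath(label))
    return matches[0].text.strip() if matches else None
```

With a live driver, this would be called as get_property(browser, "Merk") or get_property(browser, "Model"); a missing label simply yields None, so no positional indexing is needed.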

Upvotes: 3
