How to extract text from HTML (after certain string)

Question

I have the following HTML:


    
        ::before
    
    
        abc: 
        st1

I want to extract st1, the text that always comes right after abc. I was able to do it with this XPath:

xpath('.//b[@class = "text-primary text-dark"]')[0].text

But that only works when the value is the first element with this class. The class appears more than once on the page, and not always in the same order. Is there a way to search the HTML for abc and pull the text that follows it?

lauda · Accepted Answer

One approach: find the element that contains abc, navigate to its child or parent if needed, then get the text.
Example selectors:

Find any element (* matches any tag) whose text contains abc, and select any of its children.
//*[contains(text(), 'abc')]/*
Find any element whose text contains abc, and select its b child.
//*[contains(text(), 'abc')]/b
Find the li element that has a descendant whose text contains abc, and select the b element inside that li. Use // because b may not be a direct child of li.
//li[.//*[contains(text(), 'abc')]]//b
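
As a rough illustration, here is how the li-based selector might be used from Python with lxml. It is written as `.//*[...]` because XPath needs a node test after `//`. The sample markup, the `find_value_after` helper, and the `label` parameter are all assumptions made for this sketch, since the real page's tags are not shown in the question.

```python
from lxml import html

# Hypothetical markup: the real page structure is unknown, so this is an assumption.
SAMPLE = """
<ul>
  <li><span>xyz: </span><b class="text-primary text-dark">other</b></li>
  <li><span>abc: </span><b class="text-primary text-dark">st1</b></li>
</ul>
"""


def find_value_after(tree, label):
    """Return the text of the first <b> inside the <li> that mentions `label`, or None."""
    # $label is an XPath variable; lxml fills it in from the keyword argument.
    matches = tree.xpath("//li[.//*[contains(text(), $label)]]//b", label=label)
    if not matches or matches[0].text is None:
        return None
    return matches[0].text.strip()


tree = html.fromstring(SAMPLE)
print(find_value_after(tree, "abc"))
```

Because the lookup is keyed on the label text rather than on position, it should keep working when the class appears several times or in a different order, as long as the label and the value share an li.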

If you know the abc text, start from there: check which element the selector returns, then navigate to its parent, ancestor, or child as needed.
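
A small sketch of that inspection step, again with lxml and an assumed markup (the actual tags on the page may differ): match the element holding abc, look at its parent and siblings, then pick the value from there.

```python
from lxml import html

# Hypothetical markup; the structure of the real page is an assumption.
SAMPLE = '<ul><li><span>abc: </span><b class="text-primary text-dark">st1</b></li></ul>'

tree = html.fromstring(SAMPLE)

# Start from whatever element holds the 'abc' text and see what came back.
hits = tree.xpath("//*[contains(text(), 'abc')]")
for el in hits:
    parent = el.getparent()
    print("matched:", el.tag, "| parent:", parent.tag)
    # Listing the parent's children shows where the wanted value sits.
    print("children of parent:", [child.tag for child in parent])

# Once the layout is known, step up to the parent and read its <b> child.
hit = hits[0]
value = hit.getparent().findtext("b")
print(value)
```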

For more on XPath, see the W3Schools XPath selectors reference.
