Jsoup: select the text after a tag when there are many tags

Question

I want to extract the text that follows each <strong> label using jsoup. Is there any way to select it?

Example code is below:



    
    A: *thank you* **I want to retrieve this text**

    B: *Bla..bla* *I don't want this text*

    C: *what ever text* *I dont want this*

    D: *anythinh text* *I want this*

    E: *Bla..bla* *I don't want this text*t

    F: *anythinh text* *I want this*

    

    I want this

When it finishes, it creates an automatic id, for example id=123.

Pshemo · Accepted Answer

If we can assume that every <strong> element you want to find always contains A:, D: or F:, then strong:matchesOwn(regex), where regex is A:|D:|F:, selects exactly those elements.
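One caveat worth checking: as far as I know, matchesOwn reports a match when the regex is found anywhere in the element's own text, so the unanchored A:|D:|F: would also accept a label such as NA:. The sketch below uses plain java.util.regex to show the difference between that pattern and an anchored variant. The class name, the helper found and the label NA: are made up for illustration, and the find-style matching is my assumption about jsoup, not something stated in the question.

```java
import java.util.regex.Pattern;

class LabelRegexDemo {

    // The pattern from the answer: matches if "A:", "D:" or "F:" occurs anywhere.
    static final Pattern LOOSE = Pattern.compile("A:|D:|F:");

    // Anchored variant: the whole text must be exactly "A:", "D:" or "F:".
    static final Pattern STRICT = Pattern.compile("^(A|D|F):$");

    // Assumed to mirror how matchesOwn applies the regex: a find(), not a full match.
    static boolean found(Pattern pattern, String ownText) {
        return pattern.matcher(ownText).find();
    }

    public static void main(String[] args) {
        for (String label : new String[] {"A:", "B:", "D:", "NA:"}) {
            System.out.println(label
                    + " loose=" + found(LOOSE, label)
                    + " strict=" + found(STRICT, label));
        }
    }
}
```

If your labels can contain one another, the anchored pattern is the safer choice; for the sample data in the question both patterns pick the same labels.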

After handling the <strong> elements we can move on to the second <p> and get its textual content via text().

    String html = ""
            + ""
            + " "
            + " A: *thank you* **I want to retrieve this text** "
            + " B: *Bla..bla* *I don't want this text* "
            + " C: *what ever text* *I dont want this* "
            + " D: *anythinh text* *I want this* "
            + " E: *Bla..bla* *I don't want this text*t "
            + " F: *anythinh text* *I want this* "
            + " "
            + " "
            + " I want this";

    Document doc = Jsoup.parse(html);
    Elements pElements = doc.select("#summary p");
    Elements strongElements = pElements.first().select("strong:matchesOwn(A:|D:|F:)");

    for (Element strong : strongElements) {
        System.out.println(strong.nextSibling()); // get the next sibling node, including text nodes
    }

    System.out.println("---");
    System.out.println(pElements.get(1).text()); // textual content of the second <p> ("I want this")

Output:

    *thank you* **I want to retrieve this text**
    *anythinh text* *I want this*
    *anythinh text* *I want this*
    ---
    I want this
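The tags of the sample markup are not visible here, so this is only a minimal, self-contained sketch of the same approach on assumed markup: a div with id summary, a first p in which each label sits in a strong followed by one plain text node and a br, and a second p. The class name, the helper textAfterLabels and the keep/skip texts are hypothetical. It needs jsoup on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

class LabelledTextDemo {

    // Hypothetical markup; the real page's tags are an assumption.
    static final String HTML =
            "<div id=\"summary\">"
            + "<p>"
            + "<strong>A:</strong> keep A<br>"
            + "<strong>B:</strong> skip B<br>"
            + "<strong>C:</strong> skip C<br>"
            + "<strong>D:</strong> keep D<br>"
            + "<strong>E:</strong> skip E<br>"
            + "<strong>F:</strong> keep F"
            + "</p>"
            + "<p>I want this</p>"
            + "</div>";

    // Collects the text node that directly follows each <strong> whose own text matches labelRegex.
    static List<String> textAfterLabels(Document doc, String labelRegex) {
        List<String> result = new ArrayList<>();
        for (Element strong : doc.select("#summary p strong:matchesOwn(" + labelRegex + ")")) {
            Node next = strong.nextSibling();
            if (next instanceof TextNode) { // ignore labels followed by an element or by nothing
                result.add(((TextNode) next).text().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(textAfterLabels(doc, "A:|D:|F:")); // [keep A, keep D, keep F]
        System.out.println(doc.select("#summary p").get(1).text()); // I want this
    }
}
```

If the text after a label is wrapped in further tags, nextSibling() returns only the first following node, so you would have to walk the siblings up to the next br or strong instead.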

If you don't want to rely on the content of the <strong> elements but simply on their indexes, select all of them:

    Elements allStrongElements = doc.select("#summary p strong");

and pick the ones you need by index (remember that indexes start from 0):

    System.out.println(allStrongElements.get(0).nextSibling());
    System.out.println(allStrongElements.get(3).nextSibling());
    System.out.println(allStrongElements.get(5).nextSibling());
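The index-based variant can be sketched as a small helper. This assumes markup in which every label is a strong inside the first p of a div with id summary and is followed by one plain text node; the class name, the helper textAfterIndexes and the keep/skip texts are hypothetical. Indexes outside the list are skipped here instead of letting get() throw an IndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.Elements;

class IndexPickDemo {

    // Hypothetical markup; the real page's tags are an assumption.
    static final String HTML =
            "<div id=\"summary\"><p>"
            + "<strong>A:</strong> keep A<br>"
            + "<strong>B:</strong> skip B<br>"
            + "<strong>C:</strong> skip C<br>"
            + "<strong>D:</strong> keep D<br>"
            + "<strong>E:</strong> skip E<br>"
            + "<strong>F:</strong> keep F"
            + "</p><p>I want this</p></div>";

    // Returns the text following the <strong> elements at the given 0-based positions.
    static List<String> textAfterIndexes(Document doc, int... indexes) {
        Elements allStrong = doc.select("#summary p strong");
        List<String> result = new ArrayList<>();
        for (int i : indexes) {
            if (i < 0 || i >= allStrong.size()) {
                continue; // index not present in this document
            }
            Node next = allStrong.get(i).nextSibling();
            if (next instanceof TextNode) {
                result.add(((TextNode) next).text().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(textAfterIndexes(doc, 0, 3, 5)); // [keep A, keep D, keep F]
    }
}
```

Picking by index is shorter, but it breaks silently when a label is added or removed on the page, which is why the regex on the label text is usually the more stable choice.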
