Reputation: 788
The HTML
<td> SCH4UE-01 : Chemistry <br> Block: 1 - rm. 315 </br></td>
I don't want the br tag, but I do want all of the other text (SCH4UE-01 : Chemistry).
CSS queries I have tried
td:eq(0) outputs: SCH4UE-01 : Chemistry Block: 1 - rm. 315
However, br outputs: Block: 1 - rm. 315
Upvotes: 1
Views: 110
Reputation: 2876
The <br> tag is an empty (void) element, which means that it has no end tag.
See: http://www.w3schools.com/tags/tag_br.asp
After replacing your </br> tag with <br> (if you print the jsoup document you will see that jsoup fixes such mistakes automatically), your <td> tag has four child nodes:
#text
br
#text
br
So the text SCH4UE-01 : Chemistry is the first child node (element.childNode(0)).
Code
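import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
String htmlString = "<html><body><table><td> SCH4UE-01 : Chemistry <br> Block: 1 - rm. 315 <br></td></table></body></html>";
Document doc = Jsoup.parse(htmlString);
Elements tdElements = doc.select("td");
for (Element tdElement : tdElements) {
    // childNode(0) is the text node that precedes the first <br>
    System.out.println(tdElement.childNode(0));
}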
Output
SCH4UE-01 : Chemistry
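If you would rather not depend on the text sitting at index 0, a variation is to walk the element's direct text nodes and take the first non-blank one. This is only a minimal sketch, assuming jsoup is on the classpath; the class name FirstTextNode and the helper firstText are made up for illustration, and the sample HTML adds a <tr> that the original snippet does not have.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

public class FirstTextNode {
    // Hypothetical helper: returns the trimmed text of the first non-blank
    // text node that is a direct child of the element, or "" if there is none.
    static String firstText(Element element) {
        for (TextNode node : element.textNodes()) {
            if (!node.isBlank()) {
                return node.text().trim();
            }
        }
        return "";
    }

    public static void main(String[] args) {
        String html = "<table><tr><td> SCH4UE-01 : Chemistry <br> Block: 1 - rm. 315 <br></td></tr></table>";
        Document doc = Jsoup.parse(html);
        for (Element td : doc.select("td")) {
            // Prints the text that comes before the first <br>
            System.out.println(firstText(td));
        }
    }
}
```

Because textNodes() only returns text nodes and skips the br elements, the helper still works if the cell starts with a tag or with whitespace only.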
Upvotes: 1