Alex

Reputation: 223

Get the part of a String that is not HTML in Java

In my Java application I have Strings that need to be edited. The problem is that these Strings can contain HTML tags/elements, which must not be edited (and there is no id I could use to retrieve an element).

Scenario (prefix the text with a -):

String a = "<span> <table> </table>  </span> <div></div> <div> text 2</div>";
should become: <span> <table> </table>  </span> <div></div> <div> -text 2</div>  

String b = "text";
should become: -text

String c = "<p> t </p>";
should become: <p> -t </p>  

My question is: how can I retrieve the text in a String that may contain HTML tags, given that I cannot add an id or class to them?

Upvotes: 0

Views: 93

Answers (1)

ddavison

Reputation: 29042

You can use an XML parsing library.

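String newText = null;
// Pseudocode: document, nodes() and text() stand in for whatever your parser exposes.
for ( Node node : document.nodes() ) {
  // Skip nodes without text so the tags themselves are never touched.
  if ( node.text() != null ) newText = "-" + node.text();
}

Note that this is pseudocode, not a real API.

After the loop, newText holds -text, or whatever the text of the last text-bearing node is, prefixed with -.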
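For the three strings in the question, the same idea can be sketched with nothing but the JDK. This is not a real XML/HTML parser: it swaps in a simple regex tokenizer, and it assumes plain tags with no > inside attribute values, no comments and no script blocks. The names TextPrefixer and prefixText are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class TextPrefixer {
    // A token is a tag (<...>), a run of text between tags, or a stray '<'.
    private static final Pattern TOKEN = Pattern.compile("<[^>]*>|[^<]+|<");

    // Inserts prefix before the first non-whitespace character of every
    // text run; tags and whitespace-only runs are copied through unchanged.
    static String prefixText(String input, String prefix) {
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String token = m.group();
            if (token.startsWith("<") || token.trim().isEmpty()) {
                out.append(token);
                continue;
            }
            // Keep the leading whitespace in front of the prefix.
            int i = 0;
            while (Character.isWhitespace(token.charAt(i))) {
                i++;
            }
            out.append(token, 0, i).append(prefix).append(token, i, token.length());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(prefixText("text", "-"));       // -text
        System.out.println(prefixText("<p> t </p>", "-")); // <p> -t </p>
    }
}
```

If the input can contain arbitrary real-world HTML, a proper parser is the safer choice; the regex only covers simple markup like the examples above.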

EDIT: Your question is a bit ambiguous about what "the text can contain html elements" means.
If a string doesn't contain any HTML tags, then you cannot use an XML parser on it, which raises the question: if it doesn't contain tags, why can't you just do this?

String newString = "-" + a;

Upvotes: 3
