Reputation: 10530
I have an HTML file with the following content:
<html>
<title><s:message code="test" /></title>
</html>
Java Program:
String input = readFileAsString(filePath);
Document doc = Jsoup.parse(input);
Elements messageEls = doc.select("s|message");
I see the following output:
<html>
<head>
<title>&lt;s:message code="test" /&gt;</title>
</head>
<body>
</body>
Somehow the character < is converted to &lt;. How can I get the original content without escaping? I actually need to find the <s:message elements, but because of the escaping, the selector does not find the element <s:message code="test" />.
Upvotes: 1
Views: 1231
Reputation: 18235
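Jsoup escapes it because <s:message /> is not a standard HTML tag, and the HTML parser treats the content of <title> as plain text rather than as markup.
Try using the XML parser instead: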
Document doc = Jsoup.parse(input, "", Parser.xmlParser());
Create a new XML parser. This parser assumes no knowledge of the incoming tags and does not treat it as HTML, rather creates a simple tree directly from the input.
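To tie this back to the question, here is a minimal sketch of the whole flow, assuming jsoup is on the classpath. The class name MessageTagFinder and the helper findMessages are made up for illustration, and the input is inlined as a string instead of being read from a file:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

// Hypothetical helper class, not part of jsoup.
class MessageTagFinder {

    // Parses the markup with jsoup's XML parser and selects the s:message elements.
    static Elements findMessages(String markup) {
        // Empty base URI, as in the answer; the XML parser builds the tree
        // directly from the input, so <s:message> stays an element.
        Document doc = Jsoup.parse(markup, "", Parser.xmlParser());
        // "s|message" is jsoup's selector syntax for the prefixed tag s:message.
        return doc.select("s|message");
    }

    public static void main(String[] args) {
        String input = "<html>\n<title><s:message code=\"test\" /></title>\n</html>";
        for (Element message : findMessages(input)) {
            // Print the code attribute of each matched element.
            System.out.println(message.attr("code"));
        }
    }
}
```

Note that with the XML parser jsoup no longer adds the missing <head> and <body> elements, so selectors that rely on that normalized HTML structure may need adjusting.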
Upvotes: 1