Reputation: 27
I am trying to parse the following content using DocumentBuilder:
<html>
<head>
<meta charset="utf-8" />
<title>Test</title>
</head>
<body>
<img height="" src="google.gif?<>" />
</body>
</html>
Parsing fails with an exception saying that the src attribute value cannot contain <. I need to parse the document because I am applying an XSL transformation to it.
Is there any way to do this? As of now, I am first unescaping it, parsing it using DocumentBuilder, and escaping it again.
I retrieve the above XML as a String from the database. When I try to parse it using DocumentBuilder, I get the exception described above: src cannot contain <. I tried to escape it using StringEscapeUtils.escapeHtml, but that escapes the complete String, so DocumentBuilder still cannot parse it. Please let me know how to escape only the src value in the HTML, as I have not been able to accomplish it.
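One possible direction, shown here only as a rough sketch, is to escape < and > inside attribute values before handing the String to DocumentBuilder, instead of escaping the whole document. The class name AttributeEscaper and its two helper methods are hypothetical names made up for this illustration. The regex assumes every attribute value is double-quoted and contains no embedded double quote, and it does not handle a raw & inside a value; it is not tag-aware, so text content that happens to look like name="value" would be rewritten too.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: escapes only attribute values, then parses the result.
class AttributeEscaper {
    // Matches name="value" pairs. Assumption: values are double-quoted
    // and contain no embedded double quote.
    private static final Pattern ATTR = Pattern.compile("([\\w:-]+)=\"([^\"]*)\"");

    // Replaces < and > inside each attribute value with XML entities,
    // leaving the tags themselves untouched.
    public static String escapeAttributeValues(String markup) {
        Matcher m = ATTR.matcher(markup);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = m.group(2).replace("<", "&lt;").replace(">", "&gt;");
            String replacement = m.group(1) + "=\"" + value + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Escapes the attribute values, then parses the markup as XML.
    public static Document parse(String markup) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(escapeAttributeValues(markup))));
    }

    public static void main(String[] args) throws Exception {
        String html = "<html><body><img height=\"\" src=\"google.gif?<>\" /></body></html>";
        Document doc = parse(html);
        Element img = (Element) doc.getElementsByTagName("img").item(0);
        // The parser decodes the entities, so the DOM holds the original value.
        System.out.println(img.getAttribute("src")); // prints google.gif?<>
    }
}
```

Because the DOM stores the decoded value, a later XSL transformation or serialization should write the < back out as an entity on its own, so a separate "escape again" step may not be needed.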
Upvotes: 0
Views: 995