Reputation: 48
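I get an error when parsing HTML. I think the problem comes from missing quotation marks around attribute values: the page source returned by DJ Native Swing contains language=javascript, which causes the error, whereas language="javascript" parses fine. I have tried every version of the DJ Native Swing library.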
[Fatal Error] :2:18: Open quote is expected for attribute "{1}" associated with an element type "language".
org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 18; Open quote is expected for attribute "{1}" associated with an element type "language".
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
private Document HTMLtoXML(String source)
{
    Document doc = null;
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder;
    try {
        builder = factory.newDocumentBuilder();
        InputSource src = new InputSource(new StringReader(source));
        doc = builder.parse(src);
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return doc;
}

public void StartTakip()
{
    String htmlSource = webbrowser.getHTMLContent();
    dc = HTMLtoXML(htmlSource);
}
This is the page source I get for an HTML page via DJ Native Swing:
<HTML>
<HEAD>
<SCRIPT language=javascript src="/medula/scripts/capFirstLetters.js"></SCRIPT>
<TITLE>deneme</TITLE>
</HEAD>
<BODY bgcolor=#233333>
</BODY>
</HTML>
If the source looks like the one below, parsing works fine:
<HTML>
<HEAD>
<SCRIPT language="javascript" src="/medula/scripts/capFirstLetters.js"></SCRIPT>
<TITLE>deneme</TITLE>
</HEAD>
<BODY bgcolor="#233333">
</BODY>
</HTML>
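The difference between the two sources above can be reproduced with only the JDK. This is a minimal sketch, not the code from my application: the class name QuoteCheck and the helper parsesAsXml are made up for illustration, and the markup is cut down to the BODY tag. It shows that the XML parser rejects an unquoted attribute value and accepts the same markup once the value is quoted.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class QuoteCheck {

    // Hypothetical helper: returns true if the JDK XML parser accepts the
    // markup as well-formed XML, false if it throws a SAXException.
    static boolean parsesAsXml(String source) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(source)));
            return true;
        } catch (SAXException e) {
            // Well-formedness errors such as a missing open quote end up here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String unquoted = "<HTML><BODY bgcolor=#233333></BODY></HTML>";
        String quoted = "<HTML><BODY bgcolor=\"#233333\"></BODY></HTML>";

        System.out.println(parsesAsXml(unquoted)); // false
        System.out.println(parsesAsXml(quoted));   // true
    }
}
```

So the XML parser itself is behaving as designed: XML requires quoted attribute values, while browsers tolerate unquoted ones in HTML.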
Upvotes: 1
Views: 678
Reputation: 48
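I solved this problem with Jsoup (jsoup-1.7.3.jar), which parses the HTML even when attribute values are unquoted. For example:
// Document here is org.jsoup.nodes.Document, not org.w3c.dom.Document
JWebBrowser jwebbrowser = new JWebBrowser();
Document doc = Jsoup.parse(jwebbrowser.getHTMLContent());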
Upvotes: 1