Reputation: 953
I'm looking for a QT HTML parser tool. I have some html source code and I'd like to use XQuery on it. I already tried using QWebPage + QWebElement, but I don't like this solution cause firstly it doesn't works on non-gui thread (because of QWebPage) and because we can't apply XPath but CSS Path. The other solution I tried is QXmlQuery, it works great, but the only problem is that it doesn't works if there is an error on the page. For example, the first page I tried was missing systemId (in the DOCTYPE tag), so the parsing was aborted.
I heard we can use gecko for parsing but I have no idea how to use it with QT.
Have you some suggestions ?
Thanks
Upvotes: 1
Views: 1264
Reputation: 38682
BaseX got a QT client and can use TagSoup for cleaning up HTML documents.
I'm sorry I cannot provide you with an QT example as I don't know QT at all.
Upvotes: 1
Reputation: 2166
I recommend that you use tidy on your HTML page and then process it with XQuery.
Zorba is a C++ XQuery processor that provides a tidy module. You can find a live example at http://www.zorba-xquery.com/html/demo#tQZu6aq1K4KoGJm9m0oIPwKRt04=
Upvotes: 1