Christophe

Reputation: 953

Qt HTML Parser (+XQuery)

I'm looking for an HTML parser for Qt. I have some HTML source and I'd like to run XQuery on it. I already tried QWebPage + QWebElement, but I don't like that solution: first, it doesn't work in a non-GUI thread (because of QWebPage), and second, it only supports CSS selectors, not XPath. The other solution I tried is QXmlQuery. It works great, except that it fails as soon as the page contains an error. For example, the first page I tried was missing the system identifier in its DOCTYPE declaration, so parsing was aborted.

I heard Gecko can be used for parsing, but I have no idea how to use it with Qt.

Do you have any suggestions?

Thanks

Upvotes: 1

Views: 1264

Answers (2)

Jens Erat

Reputation: 38682

BaseX has a Qt client and can use TagSoup to clean up HTML documents.

I'm sorry I cannot provide a Qt example, as I don't know Qt at all.

Upvotes: 1

wcandillon

Reputation: 2166

I recommend that you use tidy on your HTML page and then process it with XQuery.

Zorba is a C++ XQuery processor that provides a tidy module. You can find a live example at http://www.zorba-xquery.com/html/demo#tQZu6aq1K4KoGJm9m0oIPwKRt04=

Upvotes: 1
