Reputation: 4573
I don't really understand the README for htmlparser, and I've searched the internet but can't find a proper tutorial for it (or for other Node.js parsers).
Usually, when a fairly complete and long-established library has no tutorial, it's because the library is easy enough to use that nobody feels the need to write one. But I find the Node.js HTML parsers pretty hard to understand.
Upvotes: 3
Views: 6201
Reputation: 3241
You should check out htmlparser2. It's the newer version of htmlparser and it has a decent README. I tend not to use it in a streaming fashion, so my code looks something like this:
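var htmlparser = require("htmlparser2");
// The callback runs once parsing has finished; dom is the parsed tree.
var handler = new htmlparser.DomHandler(function (err, dom) {
  if (err) throw err;
  // ... work with dom here
});
new htmlparser.Parser(handler).parseComplete(html_string);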
For the code inside the handler function, I use soupselect because it's documented and I'm lazy. The htmlparser2 authors suggest domutils instead, but it has no documentation.
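If you'd rather not pull in a selector library at all, the dom can also be walked by hand. Here is a minimal sketch, assuming each node is a plain object with type, name, attribs and children fields, which is the shape I understand DomHandler to produce. The sampleDom below is written by hand so the snippet runs without htmlparser2 installed, and findByTagName is a made-up helper name, not part of any library:

```javascript
// Hand-written stand-in for the dom argument, in the node shape assumed above,
// for: <div id="main"><a href="/a">first</a><p><a href="/b">second</a></p></div>
var sampleDom = [
  {
    type: "tag", name: "div", attribs: { id: "main" },
    children: [
      { type: "tag", name: "a", attribs: { href: "/a" },
        children: [{ type: "text", data: "first" }] },
      { type: "tag", name: "p", attribs: {},
        children: [
          { type: "tag", name: "a", attribs: { href: "/b" },
            children: [{ type: "text", data: "second" }] }
        ] }
    ]
  }
];

// Hypothetical helper: collect every element with the given tag name,
// visiting nodes depth-first in document order.
function findByTagName(nodes, tagName, found) {
  found = found || [];
  nodes.forEach(function (node) {
    if (node.type === "tag" && node.name === tagName) {
      found.push(node);
    }
    // Text nodes have no children, so only recurse when the field exists.
    if (node.children) {
      findByTagName(node.children, tagName, found);
    }
  });
  return found;
}

var links = findByTagName(sampleDom, "a");
var hrefs = links.map(function (node) { return node.attribs.href; });
console.log(hrefs); // the href values "/a" and "/b", in that order
```

Inside the real handler callback you would pass dom instead of sampleDom. For anything beyond simple tag lookups, soupselect or domutils will probably save you the effort.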
Upvotes: 6