Reputation: 3518
The "XML Processing Modules" page in Python's documentation lists known vulnerabilities in the standard library's XML parsers. I would assume that html5lib is not vulnerable to malicious input in the same way, since it follows the HTML5 spec (unknown bugs aside), but I hate making assumptions, and I can't find any discussion of potential security issues.
So are there any security issues I should be aware of? Or is it safe to use html5lib to parse maliciously constructed HTML?
Upvotes: 0
Views: 98
Reputation: 5692
The short answer is no, at least none that anyone is aware of: the XML attacks take advantage of "features" of XML, such as document-defined entities and external DTD retrieval, that don't exist in HTML. (Technically, "decompression bombs" apply to almost any format and aren't really attacks on XML; they're attacks on decompressors.)
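To make the difference concrete, here is a minimal sketch that feeds the same small, harmless entity-declaring input to the standard library's XML parser and to html5lib. The `PAYLOAD` string and the helper names `xml_text` and `html_text` are made up for illustration, and it assumes html5lib is installed (the HTML helper returns `None` otherwise). Treat it as an illustration of the idea, not a security test.

```python
import xml.etree.ElementTree as ET

try:
    import html5lib  # third-party; assumed installed via `pip install html5lib`
except ImportError:
    html5lib = None

# Hypothetical input: a tiny, harmless version of an entity-expansion payload.
PAYLOAD = '<!DOCTYPE r [<!ENTITY a "ha"><!ENTITY b "&a;&a;&a;">]><r>&b;</r>'


def xml_text(doc):
    """Parse as XML; entities declared in the internal DTD subset are expanded."""
    return ET.fromstring(doc).text


def html_text(doc):
    """Parse as HTML with html5lib and return all text, or None if unavailable."""
    if html5lib is None:
        return None
    root = html5lib.parse(doc)  # default tree builder is xml.etree.ElementTree
    return "".join(root.itertext())


if __name__ == "__main__":
    print(repr(xml_text(PAYLOAD)))   # the XML parser expands &b; into "hahaha"
    print(repr(html_text(PAYLOAD)))  # HTML has no way to define &a; or &b;
```

The XML parse returns the expanded text `hahaha`, which is the mechanism the "billion laughs" style attacks scale up. An HTML parser has no notion of document-defined entities, so the declarations are not interpreted and the expanded string should not appear in the html5lib result.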
Upvotes: 2