I'm looking at the source of a third-party HTML sanitizer. After parsing the HTML into a DOM tree, the code does two things: it filters elements and attributes against a whitelist, and it runs every attribute value through UrlPathEncode.
What is the latter for? What kind of attack is it meant to prevent? Some flavor of XSS most likely, but through which pathway? Sneaking JavaScript into event handler attributes is already prevented by the whitelist, isn't it?
Meanwhile, unconditional URL-encoding of all attributes will mess up some user-visible text, like alt on images and title on links. The browser doesn't URL-decode those; I've checked.
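To make the concern concrete, here is a minimal Python sketch of what I mean. It is not the sanitizer's code: `urllib.parse.quote` is only a rough stand-in for UrlPathEncode (the two do not encode exactly the same set of characters), and the `encode_attributes` helper and the sample attributes are made up for illustration.

```python
from urllib.parse import quote


def encode_attributes(attrs):
    """Percent-encode every attribute value, regardless of attribute name.

    Hypothetical helper: mimics an unconditional URL-encoding pass,
    using quote() as an approximation of UrlPathEncode.
    """
    return {name: quote(value, safe="/:") for name, value in attrs.items()}


# Sample <img> attributes (made up for this example).
img = {"src": "/images/cat photo.png", "alt": "A cat on a sofa"}
encoded = encode_attributes(img)

# For a URL attribute the encoding is harmless: the browser decodes the path.
print(encoded["src"])  # /images/cat%20photo.png

# For display text it is not: the browser shows the value verbatim.
print(encoded["alt"])  # A%20cat%20on%20a%20sofa
```

The `src` value still resolves to the same resource, but the `alt` text would be shown to the user with the literal `%20` sequences in it.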
Upvotes: 1
Views: 348