Reputation: 20115
My site has user-generated content. I noticed that if a user's text contains quotes and I later display that text in an HTML attribute, the layout gets screwed up in IE:
<a href="link.html" title="user "description" of link">Hello</a>
However, if I generate the same anchor with JavaScript (Prototype library), the layout is not screwed up in IE:
$$('body').first().appendChild(
  new Element('a', {
    title: 'user "description" of link',
    href: 'link.html'
  }).update('Hello')
);
Why is this so? The JS and the plain HTML versions both have the same intended result, but only the JS doesn't screw up IE. What's happening behind the scenes?
BTW, I do strip_tags() and clean XSS attacks from all user input, but I don't strip all HTML entities because I use a lot of form text input boxes to display back user generated text. Form elements literally display HTML entities, which looks ugly.
Upvotes: 1
Views: 251
Reputation: 839
The answer to your question, "Why is this so?", is that your JavaScript example delimits the title string with single quotes, so the double quotes in the user-generated string do not end the value early.
In your <a> tag example, putting single quotes around the title attribute's value may be a way to solve the rendering problem.
However, your HTML attributes should be in double quotes, so you would be better off using entities, as suggested by @elusive in his answer.
Upvotes: 0
Reputation: 1653
I don't know how you are processing the user-generated content, but you could use a replace function to strip the quotes from the input, something like string.replace("\"", "")
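If that snippet is meant as JavaScript (an assumption; the answer does not name a language), note that replace() with a string pattern only removes the first match. A rough sketch of the difference, with variable names made up for illustration:

```javascript
// Illustrative input, taken from the question's example title.
var input = 'user "description" of link';

// String pattern: only the FIRST double quote is removed.
var firstOnly = input.replace('"', '');

// Regular expression with the g flag: EVERY double quote is removed.
var allQuotes = input.replace(/"/g, '');

console.log(firstOnly);
console.log(allQuotes);
```

Either way this throws away characters the user typed, so it changes the displayed text rather than just protecting the markup.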
Upvotes: 0
Reputation: 30986
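You need to escape all user-specified output using HTML entities (for example, &quot; for a double quote). The DOM methods take care of this for you, because they set the attribute value directly instead of parsing it as HTML.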
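As a minimal sketch of what escaping with entities could look like when the markup is built as a string: escapeHtml below is a hypothetical helper, not something from the question or from Prototype, and it only covers the five characters that matter in text and quoted attribute values.

```javascript
// Hypothetical helper (an assumption, not part of the original posts):
// replaces the characters that are significant in HTML text and in
// quoted attribute values with their entities.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')   // first, so the entities added below are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// The title from the question, inserted into a double-quoted attribute.
var title = 'user "description" of link';
var html = '<a href="link.html" title="' + escapeHtml(title) + '">Hello</a>';

console.log(html);
// <a href="link.html" title="user &quot;description&quot; of link">Hello</a>
```

The browser decodes &quot; back into a double quote when it reads the attribute, so the tooltip still shows the original text. If the HTML is generated on the server in PHP, htmlspecialchars() does the same job at output time.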
Upvotes: 5