leora

Reputation: 196891

Why do I need to call Html.Encode()?

If a user enters data into a rich text editor (tiny editor), and I store that data in a database and later retrieve it to show on other dynamic web pages, why do I need encoding here?

Is the only reason that someone might paste JavaScript into the rich text editor? Is there any other reason?

Upvotes: 5

Views: 3442

Answers (9)

Web Logic

Reputation:

Security is the reason.

The most obvious and common reason is cross-site scripting (XSS), which is the root cause of many of the security problems you might see on your site.

Cross-site scripting (XSS) is a type of computer security vulnerability typically found in web applications that enables malicious attackers to inject client-side script into web pages viewed by other users. An exploited cross-site scripting vulnerability can be used by attackers to bypass access controls such as the same origin policy. Cross-site scripting carried out on websites accounted for roughly 80% of all security vulnerabilities documented by Symantec as of 2007. Their impact may range from a petty nuisance to a significant security risk, depending on the sensitivity of the data handled by the vulnerable site, and the nature of any security mitigations implemented by the site's owner.

Additionally, as the comments below show, unencoded input can also break the layout of your site.

You need the Microsoft Anti-Cross Site Scripting Library.

More Resources

http://forums.asp.net/t/1223756.aspx

Upvotes: 16

Flory

Reputation: 2849

The primary reason to do what you're suggesting is to escape your output. Since you are accepting HTML and want to output it as HTML, you can't do that. What you need to do instead is filter out the things users can do that are insecure, or at least not what you want.

For that, let me suggest AntiSamy.

You can demo it here.

What you are doing has a lot of inherent risks, and you should consider it very carefully.

Upvotes: 0

C. Dragon 76

Reputation: 10072

I think you're confusing "encoding" with "scrubbing."

If you want to accept text from a user, you need to encode it as HTML before you render it as HTML. In this way, the text

a < b

is HTML-encoded as

a &lt; b

and rendered in an HTML browser (just as the user entered it) as:

a < b

If you want to accept HTML from a user (which it sounds like you do in this case), it's already in HTML format, so you don't want to call Html.Encode again. However, you may want to scrub it to remove certain markup that you don't allow (like script blocks).
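To make the difference concrete, here is a minimal sketch. It uses Python's standard html.escape purely as a stand-in for Html.Encode (an assumption for illustration; the ASP.NET helper is not shown here), and the variable names are made up.

```python
import html

# Plain text typed by a user must be encoded once before it goes into a page.
user_text = "a < b"
encoded = html.escape(user_text)
print(encoded)  # a &lt; b

# Markup from a rich text editor is already HTML. Encoding it again turns
# the tags into visible text instead of formatting.
editor_html = "<b>Hello!</b>"
double_encoded = html.escape(editor_html)
print(double_encoded)  # &lt;b&gt;Hello!&lt;/b&gt;
```

The second result is what the reader would see on the page as literal text, which is why rich text needs scrubbing rather than encoding.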

Upvotes: 3

John Ptacek

Reputation: 1886

As an aside: MVC2 has implemented new functionality so you no longer need to call Html.Encode

if you change your view syntax from

<%= ... %>

to

<%: ... %>

MVC will automatically encode for you. It makes things much easier and quicker. Again, MVC2 only.

Upvotes: 1

Atanas Korchev

Reputation: 30661

Another reason is that a user can input a few closing tags, such as </div></table>, and potentially break the layout of your web site. If you are using an HTML editing tool, make sure the produced HTML is valid before embedding it in the page without encoding. Some server-side parsing is required to do this; you can use HtmlAgilityPack for it.

Upvotes: 0

Dustin Laine

Reputation: 38553

Yes, it is to prevent JavaScript from executing if someone were to input a malicious string into the rich text editor. However, plain-text JavaScript is not your only concern. For example, this is also XSS:

<IMG SRC=&#0000106&#0000097&#0000118&#0000097&#0000115&#0000099&#0000114&#0000105&#0000112&#0000116&#0000058&#0000097&#0000108&#0000101&#0000114&#0000116&#0000040&#0000039&#0000088&#0000083&#0000083&#0000039&#0000041>

Take a look here for a range of different XSS vectors: http://ha.ckers.org/xss.html
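As a rough illustration of why that IMG example slips past simple filters, the sketch below decodes its character references with Python's standard html.unescape. The substring check is a hypothetical naive filter, not something any particular library does.

```python
import html

# The payload from above, split across lines only for readability.
payload = (
    "<IMG SRC=&#0000106&#0000097&#0000118&#0000097&#0000115&#0000099"
    "&#0000114&#0000105&#0000112&#0000116&#0000058&#0000097&#0000108"
    "&#0000101&#0000114&#0000116&#0000040&#0000039&#0000088&#0000083"
    "&#0000083&#0000039&#0000041>"
)

# A naive blacklist check finds nothing suspicious in the raw string.
naive_hit = "javascript:" in payload.lower()

# Decoding the character references, as an HTML parser would, reveals
# the hidden URL scheme.
decoded = html.unescape(payload)
decoded_hit = "javascript:" in decoded.lower()

print(naive_hit, decoded_hit)  # False True
print(decoded)                 # <IMG SRC=javascript:alert('XSS')>
```

This is why searching the raw input for "bad" strings is not enough: the check has to run on what the parser actually sees.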

Upvotes: 1

SLaks

Reputation: 888283

You're making some mistakes.

If you're accepting HTML-formatted text from the rich-text editor, you cannot call Html.Encode, or it will encode all of the HTML tags, and you'll see raw markup instead of formatted text.

However, you still need to protect against XSS.

In other words, if the user enters the following HTML:

<b>Hello!</b>
<script>alert('XSS!');</script>

You want to keep the <b> tag, but drop (not encode) the <script> tag.
Similarly, you need to drop inline event attributes (like onmouseover) and JavaScript URLs (like <a href="javascript:alert('XSS!');">Dancing Bunnies!</a>).

You should run the user's HTML through a strict XML parser and maintain a strict white-list of tags and attributes when saving the content.
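A minimal sketch of the whitelist idea, written with Python's standard html.parser rather than a strict XML parser (a swapped-in tool for illustration only). The allowed tags, attributes, and URL schemes are hypothetical, and a maintained sanitizer library is the safer choice for real use.

```python
import html
from html.parser import HTMLParser

# Hypothetical allow-list; tune it to what your editor actually produces.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br", "ul", "ol", "li", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")  # relative links are dropped too
DROP_WITH_CONTENT = {"script", "style"}
VOID_TAGS = {"br"}


class AllowListSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # drop the tag itself, keep any text inside it
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops onmouseover and every other unlisted attribute
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # drops javascript: URLs, including entity-encoded ones
            kept.append(f' {name}="{html.escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(html.escape(data))  # text is re-encoded on output


def sanitize(markup):
    parser = AllowListSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


print(sanitize("<b>Hello!</b>\n<script>alert('XSS!');</script>"))
print(sanitize("<a href=\"javascript:alert('XSS!');\">Dancing Bunnies!</a>"))
```

The first call keeps the <b> tag and drops the script block with its content; the second keeps the link text but strips the href. Anything not explicitly listed is removed, which is the point of a whitelist over a blacklist.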

Upvotes: 3

Vivian River

Reputation: 32430

Not only could a user enter JavaScript code or some other naughtiness; you also need to HTML-encode in order to display certain characters on the page. You wouldn't want your page to break because your database contained: "Nice Page :->".

Also, if you are entering the code into a database, be sure to sanitize the inputs to the database.
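One common way to handle the database side is a parameterized query rather than hand-rolled sanitizing. The sketch below assumes an in-memory SQLite database and a made-up posts table, just to show the shape of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")

# Input that would break a query built by string concatenation.
user_html = "<p>Nice Page :-></p>'); DROP TABLE posts; --"

# The ? placeholder passes the value separately from the SQL text,
# so it is stored as data and never parsed as SQL.
conn.execute("INSERT INTO posts (body) VALUES (?)", (user_html,))

stored = conn.execute("SELECT body FROM posts").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(stored == user_html, row_count)  # True 1
```

Note that this only protects the database. The stored HTML still has to be encoded or scrubbed when it is rendered on a page.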

Upvotes: 2

Abe Miessler

Reputation: 85126

Security is the main reason.

Upvotes: 2
