face

Reputation: 81

How to avoid saving escaped data?

As I have read, it is good practice to save the original user input to the database, because it may later be used in different contexts and needs to be escaped differently depending on the context in which it will appear.

My problem is the following:

  1. For example, a user writes an article and hits the save button. It is saved to the database in its original form (perhaps with SQL escaping before the save).

  2. Later, when the user wants to edit the same article, we escape the text because it appears in an HTML context when we show it in an editor. So the user gets the HTML-escaped version of the article.

  3. After editing the article, the user saves the already escaped version of the text, and we store it in the database in its "original" (HTML-escaped) form.

At this point we can't use it normally, because it is already in escaped form in the database.

It does not have to be an article; imagine that it is a user's name. We have to escape it, because when it appears on an admin site we need to make sure that the admin won't be XSSed. When the admin edits and saves the name, it will be saved in escaped form. The user will not be able to log in again, because his name contained (for example) an apostrophe (') character, which is escaped to &#39; or &apos;, and the user will never enter the escaped form of his name.

What is the correct way to handle this type of problem? If I unescaped the input before saving, I would violate the principle of saving the data in its original form, and I could get wrong results when the user sends unescaped data (a new article).

Upvotes: 0

Views: 475

Answers (1)

deceze

Reputation: 522076

The escaped data is always dependent on its context!
'Foo & \'Bar\' & Baz' as an SQL literal means "Foo & 'Bar' & Baz".
Foo &amp; 'Bar' &amp; Baz in HTML means "Foo & 'Bar' & Baz".

Because the SQL escaped string is interpreted by the database, it appears without its escaping.
Because HTML is interpreted by the browser, it appears without the encoded entities to the user.

Escaping is a mechanism to transport the data intact. It does not alter the data permanently. The user always sees the original data once it has been interpreted by the technology "filter" he's looking at it through.
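To make the "transport" idea concrete, here is a minimal Python sketch. The table name `articles`, the in-memory SQLite database, and the use of `html.unescape` as a stand-in for the browser's entity decoding are all assumptions made for illustration; they are not from the question.

```python
import html
import sqlite3

original = "Foo & 'Bar' & Baz"

# SQL context: parameter binding lets the driver handle the quoting,
# so the database ends up holding the original characters.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO articles (body) VALUES (?)", (original,))
stored = conn.execute("SELECT body FROM articles WHERE id = 1").fetchone()[0]

# HTML context: escape only at the moment the text is written into markup.
markup = html.escape(stored, quote=True)

# Stand-in for the browser: decoding the entities yields the original text.
seen_by_user = html.unescape(markup)

print(stored == original)        # the database holds the unescaped text
print(markup)                    # the escaped form exists only in the markup
print(seen_by_user == original)  # the reader sees the original again
```

In both contexts the escaped form exists only "on the wire"; what is stored and what the user sees are the same original string.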

If you actually have the problem that data appears escaped where it shouldn't, you're escaping one time too many somewhere.
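The edit-form scenario from the question can be sketched as follows, in Python. The helper names `render_edit_field` and `submitted_value` are hypothetical, and Python's `HTMLParser` is used here only as an assumed stand-in for how a browser decodes attribute values before submitting a form.

```python
import html
from html.parser import HTMLParser


def render_edit_field(stored_value: str) -> str:
    """Escape once, at output time, for the HTML attribute context."""
    return f'<input name="name" value="{html.escape(stored_value, quote=True)}">'


class FieldReader(HTMLParser):
    """Collects input values the way a browser would see them (entities decoded)."""

    def __init__(self):
        super().__init__()
        self.values = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and "name" in attrs:
            self.values[attrs["name"]] = attrs.get("value", "")


def submitted_value(markup: str, field: str) -> str:
    """Return what the form would send back for the given field."""
    reader = FieldReader()
    reader.feed(markup)
    reader.close()
    return reader.values[field]


name = "O'Brien"

# Correct flow: the stored value is raw, it is escaped once for the form,
# and the submitted value is the raw name again, ready to be stored as-is.
round_tripped = submitted_value(render_edit_field(name), "name")
print(round_tripped == name)

# Buggy flow: the value was already HTML-escaped before it was stored,
# so escaping it again for output leaves entity text visible to the user.
wrongly_stored = html.escape(name, quote=True)
shown_to_user = submitted_value(render_edit_field(wrongly_stored), "name")
print(shown_to_user == name)
```

In the first flow nothing needs to be unescaped on the server, because the form submits the decoded text. Entity text only shows up in the second flow, where one escaping step too many was applied before storage.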

Also see The Great Escapism (Or: What You Need To Know To Work With Text Within Text).

Upvotes: 3
