Reputation: 102765
I am thinking of secure ways to serve HTML and JSON to JavaScript. Currently I am just outputting the JSON like:
ajax.php?type=article&id=15
{
"name": "something",
"content": "some content"
}
but I do realize this is a security risk -- because the articles are created by users. So, someone could insert script tags (just an example) for the content and link to his article directly in the AJAX API. Thus, I am now wondering what's the best way to prevent such issues. One way would be to encode all non alphanumerical characters from the input, and then decode in JavaScript (and encode again when put in somewhere).
Another option could be to send some headers that force the browser to never render the response of the AJAX API requests (Content-Type
and X-Content-Type-Options
).
Upvotes: 5
Views: 998
Reputation: 67019
If you set the Content-Type
to application/json
then NO Browser will execute JavaScript on that page. This is apart of RFC-4627, and Google uses this to protect them selves. Other Application/
Content types follow similar rules.
You still have to worry about DOM Based XSS, however this would be a problem with your JavaScript, not really the content of the json. Another more exotic security concern with Json is information leakage like this vulnerability in gmail.
Make sure to always test your code. There is the Sitewatch free xss scanner, or the open source Skipfish and finally you could test this manually with a simple <script>alert(/xss/)</script>
.
Upvotes: 7
Reputation: 1800
Instead of worrying about how you could encode the malicious code when you return it, you should probably take care that it does not even get into your database. A quick google search about preventing cross-site scripting and input validation might help you here. Cheers
Upvotes: 4
Reputation: 26979
For outputting safe html from php, I recommend http://htmlpurifier.org/
Upvotes: 0
Reputation: 2075
If the user has to be logged in to view the web page then secure the ajax.php with the same authorization mechanism. Then a client that's not logged in cannot access ajax.php directly to retrieve the data.
Upvotes: 1
Reputation: 413682
Insertion of script tags (or SQL) is only a problem if you fail to ensure it isn't at the point that it could be a problem.
A <script>
tag in the middle of a comment that somebody submits will not hurt your server and it won't hurt your database. What it would hurt, if you fail to take appropriate measures, would be a page that includes the comment when you subsequently serve it up and it reaches a client browser. In order to prevent that from happening, your code that prepares the page must make sure that user-supplied content is always scrubbed before it is exposed to an unaware interpreter. In this case, that unaware interpreter is a client web browser. In fact, your client web browser really involves two unaware interpreters: the HTML parser & layout engine and the Javascript interpreter.
Another important example of an unaware interpreter is your database server. Note that a <script>
tag is (almost certainly) harmless to your database, because "" doesn't mean anything in SQL. It's other sorts of input that cause problems for SQL, like quotes in strings (which are harmless to your HTML pages!).
Stackoverflow would be pretty lame if I couldn't put <script>
tags in my answers, as I'm doing now. Same goes for examples of SQL Injection attacks. Recently somebody linked a page from some prominent US bank, where a big <textarea>
was footnoted by a warning not to include the characters "<" or ">" in whatever you typed. Predictably, the bank was ridiculed over hundreds of Reddit comments, and rightly so.
Exactly how you "scrub" user-supplied content depends on the unaware interpreter to which you're delivering it. If it's going to be dropped in the middle of HTML markup, then you have to make sure that the "<", ">", and "&" characters are all encoded as HTML entitites. (You might want to do quote characters too, if the content might end up in an HTML element attribute value.) If the content is to be dropped into Javascript, however, you may not need to worry about HTML escaping, but you do need to worry about quotes, and possibly Unicode characters outside the 7-bit range.
Upvotes: 0
Reputation: 29267
I don't think your question is about validating user input, as others pointed out. You don't want to provide your JSON api to other people... right?
If this is the case then there isn't much you can do... in fact, even if you were serving HTML instead of JSON, people would still be doing HTML scraping to get what they wanted from your site (this is how Search Engine spiders work).
A good way to prevent scraping is to allow only a specific amount of downloads from an IP address. This way if someone is requesting http://yoursite.com/somejson.json
more than 100 times a day, you probably know it's a scraper, and not someone visiting your page for 100 times in 1 day.
Upvotes: 0