sqram

Reputation: 7201

How do you sanitize your data?

This is the function I currently use (from a PHP book I bought):

function escape($data) {
    return mysql_real_escape_string(trim($data), $this->linkid);    
}

But I feel like it could be safer, for example by also using htmlspecialchars. It always makes me paranoid. I've read that mysql_real_escape_string is bad and should never be used, but I've also read that it's the best way. There is a lot of confusion about sanitizing data before inserting it into the database.

So how do you do it, and what are the pros and cons of your approach?

Upvotes: 3

Views: 900

Answers (8)

PartialOrder

Reputation: 2960

  1. In general, use filter_var()

  2. In cases where only very specific formats or values are allowed it may be better to use regexes or in_array() of valid values.

  3. Remember that "input" means any source of input that you don't directly control.

  4. If the input is going into a query, use prepared statements (e.g., with mysqli).
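As a rough sketch of points 1 and 2, validation with filter_var() and an in_array() whitelist could look like this. The helper names are made up for illustration, and the nullable return types assume PHP 7.1 or later:

```php
<?php
// Hypothetical helpers illustrating points 1 and 2; the names are not from the answer.

// Returns the integer if $raw is a whole number between $min and $max, otherwise null.
function validate_int_in_range($raw, int $min, int $max): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => $min, 'max_range' => $max],
    ]);
    // filter_var() returns false on failure; 0 is a valid result, so compare strictly.
    return $value === false ? null : $value;
}

// Returns the trimmed address if it passes PHP's email validation, otherwise null.
function validate_email(string $raw): ?string
{
    $value = filter_var(trim($raw), FILTER_VALIDATE_EMAIL);
    return $value === false ? null : $value;
}

// Whitelist check: only values from a fixed list are accepted (strict comparison).
function validate_choice(string $raw, array $allowed): ?string
{
    return in_array($raw, $allowed, true) ? $raw : null;
}
```

A whitelist like validate_choice() is the usual way to handle things that cannot be bound as query parameters, such as a sort direction of 'asc' or 'desc'.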

Upvotes: 0

gahooa

Reputation: 137252

Let's have a quick review of WHY escaping is needed in different contexts:

  • If you are in a quote-delimited string, you need to be able to escape the quotes.
  • If you are in XML, you need to separate "content" from "markup".
  • If you are in SQL, you need to separate "commands" from "data".
  • If you are on the command line, you need to separate "commands" from "data".

This is a really basic aspect of computing in general. Because the syntax that delimits data can occur IN THE DATA, there needs to be a way to differentiate the DATA from the SYNTAX, hence, escaping.

In web programming, the common escaping cases are:

  1. Outputting text into HTML

  2. Outputting data into HTML attributes

  3. Outputting HTML into HTML

  4. Inserting data into Javascript

  5. Inserting data into SQL

  6. Inserting data into a shell command

Each one has different security implications if handled incorrectly. THIS IS REALLY IMPORTANT! Let's review this in the context of PHP:

  1. Text into HTML: htmlspecialchars(...)

  2. Data into HTML attributes: htmlspecialchars(..., ENT_QUOTES)

  3. HTML into HTML: use a library such as HTMLPurifier to ENSURE that only valid tags are present.

  4. Data into Javascript: I prefer json_encode. If you are placing it in an attribute, you still need to apply #2 to the result.

  5. Inserting data into SQL: each driver has an escape() function of some sort, and that is the best choice. If you are running in a normal latin1 character set, addslashes(...) is suitable, but mysql_real_escape_string() is better. Don't forget the quotes AROUND the addslashes() call:

    "INSERT INTO table1 SET field1 = '" . addslashes($data) . "'"

  6. Data on the command line: escapeshellarg() and escapeshellcmd() (read the manual)

Take these to heart, and you will eliminate 95%* of common web security risks! (* a guess)
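To make cases 1, 2, 4 and 6 concrete, here is a minimal sketch with one small wrapper per context. The wrapper names are hypothetical, and the JSON_HEX_* flags are an extra precaution I am assuming rather than something the list above requires. Cases 3 and 5 are left out because they need HTMLPurifier and a database connection.

```php
<?php
// Hypothetical wrappers, one per output context; the function names are illustrative.

// Cases 1 and 2: text or attribute values into HTML.
// ENT_QUOTES is passed explicitly because the default flags differ between PHP versions.
function html_escape(string $s): string
{
    return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}

// Case 4: data into a <script> block. The JSON_HEX_* flags (an assumption, not
// part of the list above) emit "<", ">", "&" and quotes as \uXXXX escapes.
function js_literal($value): string
{
    return json_encode($value, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
}

// Case 4 inside an attribute: JSON-encode first, then apply the attribute escaping (#2).
function js_in_attribute($value): string
{
    return html_escape(js_literal($value));
}

// Case 6: a single argument for a shell command.
function list_file_command(string $filename): string
{
    return 'ls -l ' . escapeshellarg($filename);
}
```

Each wrapper is applied at the moment the data enters that context, e.g. `echo '<p>' . html_escape($comment) . '</p>';` when printing, not when the data first arrives.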

Upvotes: 4

markh

Reputation: 793

There actually is a "universal answer" for the metaproblem (safely storing user-provided data into a database) which is this: If you're not using bind parameters to avoid the whole injection issue to begin with, you're doing it wrong.

Cleaning data is a great idea, but the chance you'll miss something is high. So, whatever other methods you use (and Jani is right, it depends on the data), please don't neglect using bind variables.

Passed data should never hit a query without being bound.
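A minimal sketch of what binding looks like with PDO. To keep it runnable without a server I am assuming the pdo_sqlite extension and an in-memory SQLite database in place of MySQL; the table and helper names are made up for illustration:

```php
<?php
// Assumption: the pdo_sqlite extension is available; an in-memory SQLite
// database stands in for MySQL so the sketch runs without a server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Hypothetical helper: the value travels as a bound parameter and is never
// concatenated into the SQL string.
function insert_user(PDO $pdo, string $name): void
{
    $stmt = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
    $stmt->execute([':name' => $name]);
}

// Hypothetical helper: returns the matching names as a plain array of strings.
function find_users_by_name(PDO $pdo, string $name): array
{
    $stmt = $pdo->prepare('SELECT name FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Both values are stored verbatim; the quotes and SQL keywords are just data.
insert_user($pdo, "Robert'); DROP TABLE users;--");
insert_user($pdo, "O'Reilly");
```

Because the SQL text and the data are sent separately, there is nothing to escape and nothing to forget.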

Upvotes: 1

Chris Tonkinson

Reputation: 14459

After having made sure the data was valid and/or well-formed (see Jani Hartikainen's comment), you really only need a call to PHP's built-in addslashes().

Upvotes: -2

Maciej Łebkowski

Reputation: 3887

Sanitize your data only before you put it in a sensitive context, like:

  • part of SQL query
  • part of filename or path
  • part of a shell command
  • part of HTML output (or any other output, like CSV, XML, ATOM, etc, etc)

Don’t use one generic escape function, because you’ll then have the feeling that the data is safe, but it isn’t. Its safety depends on the context. And clearly you cannot do all the escaping at once, independent of all the situations in which you might use the data. So keep the raw data in the database (and yes, use mysql_real_escape_string() or some kind of parameter binding, using PDO for example) and use a specific escaping function when putting it into a context:

  • htmlspecialchars() when in HTML context
  • escapeshellarg() and escapeshellcmd() when in shell command context
  • etc, etc

Upvotes: 0

patros

Reputation: 7819

Use SOAP? Har har.

(disclaimer: yes, this is a joke)

Upvotes: 0

ceejayoz

Reputation: 179994

You're talking about two different types of escaping.

mysql_real_escape_string() escapes data so it'll be safe to send to MySQL.

htmlspecialchars() escapes data so it'll be safe to send to something that renders HTML.

Both work fine for their respective purposes, but parameterized queries via something like mysqli are quite a bit neater.

Upvotes: 4

Jani Hartikainen

Reputation: 43243

There is no universal answer. It should always depend on what the data is that you're storing.

  • Is it supposed to be a number? Then run it through is_numeric() (or similar)
  • Is it a string that's not allowed to contain HTML? Use htmlentities
  • etc.

Running all data through mysql_real_escape_string is a good idea. Of course this also depends on whether your code is using a DB library or PDO or something else.

For example, with PDO, instead of the mysql function, you would use $pdo->quote(). With Zend_Db's statements you need nothing extra, as it escapes things automatically for you.
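A small sketch of the is_numeric() check combined with $pdo->quote(). It assumes the pdo_sqlite extension with an in-memory database so it runs without a server; the table and the add_comment() helper are hypothetical, and the exact quoting rules depend on the driver:

```php
<?php
// Assumption: pdo_sqlite is available; quoting rules are driver-specific.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (post_id INTEGER, body TEXT)');

// Hypothetical helper: validate the number, quote the string, then build the query.
function add_comment(PDO $pdo, $postId, string $body): bool
{
    if (!is_numeric($postId)) {
        return false; // reject bad input instead of trying to repair it
    }
    // PDO::quote() returns the string already wrapped in quotes, so none are added here.
    $sql = 'INSERT INTO comments (post_id, body) VALUES ('
         . (int) $postId . ', ' . $pdo->quote($body) . ')';
    return $pdo->exec($sql) === 1;
}
```

Note that the body is stored as-is; escaping for HTML would happen later, when it is printed.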

Upvotes: 1
