Boris Däppen
Boris Däppen

Reputation: 1196

Reasons for Prepared Statements with Bind Parameters over Interpolated Statements with Escaped / Quoted Parameters

To protect against SQL injection one is advised to use prepared statements with bind values. This ensures, that the database can distinguish between the actual logic in the SQL (which has to be parsed, interpreted and optimized) and the data (which doesn't need interpretation) and therefore will not interpret and execute commands which are found in data.

Another method accomplishing some protection is using an escaping library, which disarms significant chars in the data, so that they will not be interpreted.

It seems to me that in general it is advised to prefer prepared statements with bind parameters over escaping the input. Prepared statements with bind values do for example have some performance benefits in loops.

My question: is there any security reason to prefer prepared statements with bind values over escaping? And if yes, what are the exact reasons?

One reason I might think of is that "escaping is tricky" and the escaping library needs to match exactly the database features... anything else?

Upvotes: 5

Views: 1867

Answers (3)

Caius Jard
Caius Jard

Reputation: 74605

Though I have no empirical evidence on the matter that I can point to that proves it's used, it's perhaps worth pointing out that using prepared statements that use a mix of parameter variables and constant values would allow a database query optimizer to know which parts of a query will be varied by the application and which parts will be constant. This could be of use in query planning. If you get into a mode of quoting all your data along with the query then the optimizer can't guess which parts are likely to very and which parts are fixed without keeping history of all variations of the query and looking at differences to work out which parts vary.

--we could infer that name will vary and type will not
--but we'd have to analyze all queries sent to work this out
SELECT * FROM person WHERE type = 1 AND name = 'john'
SELECT * FROM person WHERE type = 1 AND name = 'mark'
SELECT * FROM person WHERE type = 1 AND name = 'luke'


--we can easily say that type will vary and name will too
--the previously seen queries would infer differently
SELECT * FROM person WHERE type = @t AND name = @n

I mentioned I've never read anything that indicates it's used, but one can read about decisions that sql server makes called Parameter Sniffing, whereby it caches and reuses plans based on the first set of values it sees loaded into parameters, which might not generate optimal queries in all cases

Of course; this may harden your resolve to ditch parameters, or use hints to make the db re-plan the query every time, but it would be better to work with the server rather than against it, and use techniques to make it plan based on common or optimal-as-possible values


Even if we aren't getting cute with tweaking plans according to what we know about variables and constants, using a prepared statement should, at the very least, allow the db to compile and then reuse that compilation effort rather than redoing it, lowering the amount of resources that must go into readying a statement for run.

Think about your proposal in front end language terms:

To change the value of a variable action you could either change the runtime value with a simple value assignment in memory:

sayHello(string name){
  console.print("hello " + name);
}

var name = console.readString(),
sayHello(name);

Or you could instead build a whole new script file with the new value in it, write it to disk, invoke the compiler on it, quit the running version of your app and start the newly compiled version:

main(){
  disk.write("sayHello(string name){console.print(\"hello \"" + name +");}", "c:\\temp\\new.lang");
  launchExe("langcompiler.exe", "c:\\temp\\new.lang");
  launchExe("c:\\temp\\new.exe");
}

It's ridiculous to self modify the program and recompile just to change a value used in a function call, right?

Except that's what a db server does with every unparameterized SQL it receives, unless it goes to some effort to work out whether the query it just got is mostly the same as one it got X minutes ago except for some data part, extract that data, plug it into the compilation effort from 5 minutes ago..

Upvotes: 0

Your Common Sense
Your Common Sense

Reputation: 157839

The whole statement of question is coming for one ancient grave delusion

escaping which disarms significant chars in the data

Is, frankly, a nonsense.

  • There are no all-embracing "significant characters". A character that could have a devastating effect on one query part if cut loose, could be as harmless as a lamb in another. And vice versa.
  • There is no abstract all-embracing "data". All query parts are distinct, but escaping works for only one part.
  • And there is no such practice as "using escaping for protection" whatsoever.

Escaping is intended to escape special characters in SQL strings. And never has been intended for any protection. It's just a technological measure that has been awfully misunderstood and mistreated. It is like claiming that we are following a proper syntax in our programs solely for the protection. We are following the proper syntax to make the interpreter / compiler understand our code. Same here. Escaping is used to produce syntactically correct SQL strings. Which are of course injection-proof as a side effect. But again - the mission of escaping is anything but protection.

And here comes the the Escaping problem #1: strings are not the only data types to be used in the query. While using string escaping on the any other data literal is a straight road to disaster.

Moreover, even for strings, escaping is an essentially detachable measure, which alone constitutes a whole can of worms, making your code is prone to human errors of all sorts and constituting the Escaping problem #2:

Citing my article on the matter, Why should I use prepared statements if escaping is safe?:

As you can see, formatting a value for a database is effectively split into two parts, escaping variables and quoting values in the query. And this is where all the magic happens lies the cause for innumerable real life cases of SQL injections.

With your simplified example, where all the code is bound together, it is hard to overlook the proper routine. But in the real life the code is much more complex, consisting of large distinct modules. And escaping is done in one module while quoting in another. Or not. Nobody can tell actually. I'd just trust that this value has been escaped already. Or I will escape it just to be sure, and introduce extra escaping characters in the data. Or I am a new dev, who don't understand the example you posted here, and I was watching a youtube video that said escaping prevents SQL injection. I know the value has been escaped already, so I can put in the query safely. And as it is an integer, why would I waste quotes on it?

Or I know that the data has been escaped already when it was entering the application, so I won't have to escape it during some internal manipulations sometime later (when moving into another table for example). And have a first class second order SQL injection as a result.

Trust me, I've seen all these cases in the wild. Such a separated formatting introduces a total mess and a wast opportunity for injections.

Unlike escaping, prepared statements always make sure that a query part is treated properly.

Upvotes: 2

Bill Karwin
Bill Karwin

Reputation: 562270

One reason is that escaping only works to protect quoted string literals. For example (I'll use pseudocode since you didn't reference any particular programming language):

$escapedName = EscapeString("O'Reilly")

$sql = "SELECT * FROM MyTable WHERE name = '$escapedName'"

In the above example, the apostrophe should be escaped, so it will become WHERE name = 'O\'Reilly' and therefore be safe to interpolate into the SQL query without causing any error.

However, numbers don't need to be quoted in SQL, and escaping a string that contains an apostrophe won't do the right thing:

$escapedId = EscapeString("123'456")

$sql = "SELECT * FROM MyTable WHERE id = $escapedId"

This will result in WHERE id = 123\'456 which is still an error.

You might say, "well put the number in single-quotes" but this isn't always possible, for example the LIMIT clause in MySQL required real integers, not a quoted string containing digits.

Besides the above issue, it's just easier to write code using parameters instead of using escaping!

For example, you could write code like the following:

$sql = "INSERT INTO mytable (col1, col2, col3, col4, col5, col6) 
  VALUES ('" . mysqli_real_escape_string($_POST['col1']) . "', " 
  . $mysqli->real_escape_string($_POST['col2']) . "', '" 
  . $mysqli->real_escape_string($_POST['col3']) . "', '" 
  . $mysqli->real_escape_string($_POST['col4']) . ", '" 
  . $mysqli->real_escape_string($_POST['col5']) . "', '" 
  . $mysqli->real_escape_string($_POST['col6']) . "')";

Can you spot the mistakes? With enough time, I’m sure you can. But it will slow down your coding and may give you eyestrain as you look for missing quote characters and other mistakes.

But it’s so much easier to write this, and easier to read it afterwards:

$sql = "INSERT INTO mytable (col1, col2, col3, col4, col5, col6) 
  VALUES (?, ?, ?, ?, ?, ?)";

Query parameters are safe for more data types, and they help you write code more quickly, with fewer mistakes. That's a big win.

Upvotes: 6

Related Questions