Reputation: 28777
I want to find rows where a text column begins with a user given string, e.g. SELECT * FROM users WHERE name LIKE 'rob%'
but "rob" is unvalidated user input. If the user writes a string containing a special pattern character like "rob_", it will match both "robert42" and "rob_the_man". I need to be sure that the string is matched literally, how would I do that? Do I need to handle the escaping on an application level or is it a more beautiful way?
I'm using PostgreSQL 9.1 and go-pgsql for Go.
Upvotes: 4
Views: 15480
Reputation: 25237
The best answer is that you shouldn't be interpolating user input into your sql at all. Even escaping the sql is still dangerous.
The following which uses go's db/sql library illustrates a much safer way. Substitute the Prepare and Exec calls with whatever your go postgresql library's equivalents are.
// The question mark tells the database server that we will provide
// the LIKE parameter later in the Exec call
sql := "SELECT * FROM users where name LIKE ?"
// no need to escape since this won't be interpolated into the sql string.
value := "%" + user_input
// prepare the completely safe sql string.
stmt, err := db.Prepare(sql)
// Now execute that sql with the values for every occurence of the question mark.
result, err := stmt.Exec(value)
The benefits of this are that user input can safely be used without fear of it injecting sql into the statements you run. You also get the benefit of reusing the prepared sql for multiple queries which can be more efficient in certain cases.
Upvotes: -2
Reputation: 125204
To escape the underscore and the percent to be used in a pattern in like
expressions use the escape character:
SELECT * FROM users WHERE name LIKE replace(replace(user_input, '_', '\\_'), '%', '\\%');
Upvotes: 4
Reputation: 61506
The _ and % characters have to be quoted to be matched literally in a LIKE statement, there's no way around it. The choice is about doing it client-side, or server-side (typically by using the SQL replace(), see below). Also to get it 100% right in the general case, there are a few things to consider.
By default, the quote character to use before _ or % is the backslash (\), but it can be changed with an ESCAPE clause immediately following the LIKE clause. In any case, the quote character has to be repeated twice in the pattern to be matched literally as one character.
Example: ... WHERE field like 'john^%node1^^node2.uucp@%' ESCAPE '^'
would match john%node1^node2.uccp@ followed by anything.
There's a problem with the default choice of backslash: it's already used for other purposes when standard_conforming_strings is OFF (PG 9.1 has it ON by default, but previous versions being still in wide use, this is a point to consider).
Also if the quoting for LIKE wildcard is done client-side in a user input injection scenario, it comes in addition to to the normal string-quoting already necessary on user input.
A glance at a go-pgsql example tells that it uses $N-style placeholders for variables... So here's an attempt to write it in a somehow generic way: it works with standard_conforming_strings both ON or OFF, uses server-side replacement of [%_], an alternative quote character, quoting of the quote character, and avoids sql injection:
db.Query("SELECT * from USERS where name like replace(replace(replace($1,'^','^^'),'%','^%'),'_','^_') ||'%' ESCAPE '^'",
variable_user_input);
Upvotes: 8
Reputation: 28777
As far as I can tell the only special characters with the LIKE operator is percent and underscore, and these can easily be escaped manually using backslash. It's not very beautiful but it works.
SELECT * FROM users WHERE name LIKE
regexp_replace('rob', '(%|_)', '\\\1', 'g') || '%';
I find it strange that there is no such functions shipped with PostgreSQL. Who wants their users to write their own patterns?
Upvotes: 2