Reputation: 24921
I've just realized that psycopg2
allows multiple queries in one execute
call.
For instance, this code will actually insert two rows in my_table
:
>>> import psycopg2
>>> connection = psycopg2.connection(database='testing')
>>> cursor = connection.cursor()
>>> sql = ('INSERT INTO my_table VALUES (1, 2);'
... 'INSERT INTO my_table VALUES (3, 4)')
>>> cursor.execute(sql)
>>> connection.commit()
Does psycopg2
have some way of disabling this functionality? Or is there some other way to prevent this from happening?
What I've come so far is to search if the query has any semicolon (;
) on it:
if ';' in sql:
# Multiple queries not allowed!
But this solution is not perfect, because it wouldn't allow some valid queries like:
SELECT * FROM my_table WHERE name LIKE '%;'
EDIT: SQL injection attacks are not an issue here. I do want to give to the user full access of the database (he can even delete the whole database if he wants).
Upvotes: 1
Views: 5657
Reputation: 365915
If you want a general solution to this kind of problem, the answer is always going to be "parse format X, or at least parse it well enough to handle your needs".
In this case, it's probably pretty simple. PostgreSQL doesn't allow semicolons in the middle of column or table names, etc.; the only places they can appear are inside strings, or as statement terminators. So, you don't need a full parser, just one that can handle strings.
Unfortunately, even that isn't completely trivial, because you have to know the rules for what counts as a string literal in PostgreSQL. For example, is "abc\"def"
a string abc"def
?
But once you write or find a parser that can identify strings in PostgreSQL, it's easy: skip all the strings, then see if there are any semicolons left over.
For example (this is probably not the correct logic,* and it's also written in a verbose and inefficient way, just to show you the idea):
def skip_quotes(sql):
in_1, in_2 = False, False
for c in sql:
if in_1:
if c == "'":
in_1 = False
elif in_2:
if c == '"':
in_2 = False
else:
if c == "'":
in_1 = True
elif c == '"':
in_2 = True
else:
yield c
Then you can just write:
if ';' in skip_quotes(sql):
# Multiple queries not allowed!
If you can't find a pre-made parser, the first things to consider are:
find
will work, do that.re
.* This is correct for a dialect that accepts either single- or double-quoted strings, does not escape one quote type within the other, and escapes quotes by doubling them (we will incorrectly treat 'abc''def'
as two strings abc
and def
, rather than one string abc'def
, but since all we're doing is skipping the strings anyway, we get the right result), but does not have C-style backslash escapes or anything else. I believe this matches sqlite3 as it actually works, although not sqlite3 as it's documented, and I have no idea whether it matches PostgreSQL.
Upvotes: 1
Reputation: 880269
Allowing users to make arbitrary queries (even single queries) can open your program up to SQL injection attacks and denial-of-service (DOS) attacks. The safest way to deal with potentially malicious users is to enumerate exactly what what queries are allowable and only allow the user to supply parameter values, not the entire SQL query itself.
So for example, you could define
sql = 'INSERT INTO my_table VALUES (%s, %s)'
args = [1, 2] # <-- Supplied by the user
and then safely execute the INSERT statement with:
cursor.execute(sql, args)
This is called parametrized SQL because the sql uses %s
as parameter placemarkers, and the cursor.execute
statement takes two arguments. The second argument is expected to be a sequence, and the database driver (e.g. psycopg2) will replace the parameter placemarkers with propertly quoted values supplied by args
.
This will prevent SQL injection attacks. The onus is still on you (when you write your allowable SQL) to prevent denial-of-service attacks. You can attempt to protect yourself from DOS attacks by making sure the arguments supplied by the user is in a reasonable range, for instance.
Upvotes: 0