Reputation: 50190
Please don't reach for the "duplicate" gun just yet.
I need to generate a series of SQL statements involving literal strings that contain the occasional single quote. Yeah, I know that parametrized queries are the way to go. The thing is, I'm not communicating with a database directly: I'm generating an SQL script that will be used to load data on another computer. So, I don't mind issuing parametrized queries to my local database (mysql), but I'll need to output the complete SQL commands in text form. Can I do that in python? The only suggestions I saw on SO are hacks like using repr()
or json.dumps()
, or specific to psycopg. Surely that can't be all there is?
This application will never deal with untrusted data, so although I'd like the most robust solution possible, I'm not too worried about malformed unicode attacks and the like. What's the best way to do this?
Upvotes: 0
Views: 278
Reputation: 4177
While generating complex SQLs may have it's difficulties, I love python for straight INSERT
s. No tools, no libraries, just plain python solves all issues out of the box:
# syntactic sugar
def sq(s):
return s.replace("'", "''")
# a third kind of string delimiters :3
sql_template = """INSERT INTO mytab ("the key", f1, f2)
VALUES (%d, '%s', '%s');"""
the_key = 7
f1 = "Hello World"
f2 = "That's ok"
# escape only string colums
sql = sql_template % (the_key, sq(f1), sq(f2))
print sql
Upvotes: 1
Reputation: 7384
You can subclass the psycopg2 cursor class to use mogrify to write the queries to a file instead of executing them against a database. You could probably also use it directly (and save you setting up a database etc.).
You could also use the query attribute for "playback" of a session.
Upvotes: 1
Reputation: 12214
The process for producing SQL is the same for any language. You need to first understand the entirety of the language (or data format; same difference here) that you want to produce. Failure to do this is the primary failing of most attempts: since they don't understand the language they don't handle all the input that can cause a problem. When you know the entirety of the language, including all corner cases and rarely used constructs, then you can create code that can understand all possible inputs and correctly represent them in that language. The easiest solution is to let someone else who has already dealt with it do the hard part, or to sidestep the entire mess by using a different representation. This is why everyone recommends parameterized queries. It pushes the responsibility to someone else who has already solved it, and usually they solve it by using a different representation in the protocol than the SQL language itself.
Upvotes: 0