mcfar

Reputation: 137

Inserting large text data into postgres with python

I am trying to bulk-insert long XML strings as text into a PostgreSQL 9.1 database, using Python 3.2 and psycopg2. I enclose the XML string in $$ and use a named variable in the query string. For example:

query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
         cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %(objIdent)s), \
         $$%(objMetaString)s$$, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
         (select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))"

I then construct a dictionary object as follows:

dataDict = {'objIdent':objIdent, 'objMetaString':objMetaString}

passing in the objIdent and objMetaString values. I do the insert with the following code:

dbCursor.execute(query, dataDict)

When this inserts the objMetaString value into the database, the stored value has single quotes around the string. If I instead interpolate the values into the query string myself and execute the insert without the named variable, it does not. For example:

query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
         cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %s), \
         $$%s$$, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
         (select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))" % (objIdent, objMetaString)

and the insert:

dbCursor.execute(query)

My question is how to do a bulk insert of large text data using named variables and $$. I would rather not pre- or post-process the strings, since they may be large and contain an unknown number of single quotes or other symbols that would need to be escaped. I have read the following documentation and searched Stack Overflow for an answer, but have not found a solution:

Upvotes: 2

Views: 2225

Answers (1)

Peter Eisentraut

Reputation: 36729

Summarizing the comment thread. Do this:

query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
         cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %(objIdent)s), \
         %(objMetaString)s, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
         (select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))"

dataDict = {'objIdent':objIdent, 'objMetaString':objMetaString}

dbCursor.execute(query, dataDict)

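Don't put quotes of any kind (including $$) around the %(objMetaString)s placeholder in your query. It's the driver's job to quote the value if necessary. With $$ around the placeholder, the single quotes the driver adds end up inside the dollar-quoted string and are stored as part of the data.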
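To make that concrete, here is a minimal sketch. The sketch_quote function is a simplified stand-in for what the driver does to a string (it is not psycopg2's actual code), and insert_metadata_bulk is a hypothetical helper name; it assumes the cursor passed in is an open psycopg2 cursor and relies only on the standard DB-API executemany method.

```python
def sketch_quote(value):
    # Simplified stand-in for the driver's string adaptation (an assumption,
    # not psycopg2's real implementation): wrap the value in single quotes
    # and double any embedded single quote.
    return "'" + value.replace("'", "''") + "'"


xml = "<title>O'Reilly</title>"

# With $$ around the placeholder, the driver's quotes land inside the
# dollar-quoted literal, so they become part of the stored text.
with_dollars = "$$%s$$" % sketch_quote(xml)

# Without $$, the adapted value is already a complete SQL string literal.
without_dollars = "%s" % sketch_quote(xml)

# Same statement as above, with a bare named placeholder for the XML text.
INSERT_METADATA = """
    insert into cms_object_metadata
        (cms_object_id, cms_object_metadata_data,
         cms_object_metadata_type_id, cms_object_metadata_status_id)
    values (
        (select id from cms_objects where cms_object_ident = %(objIdent)s),
        %(objMetaString)s,
        (select id from cms_object_metadata_types
          where cms_object_metadata_type_name = 'PDAT'),
        (select id from cms_object_metadata_status
          where cms_object_metadata_status_name = 'active')
    )
"""


def insert_metadata_bulk(cursor, records):
    """Insert many (objIdent, objMetaString) pairs with one executemany call.

    Hypothetical helper: the XML strings are passed through untouched and
    the driver handles all quoting.
    """
    params = [{'objIdent': ident, 'objMetaString': meta}
              for ident, meta in records]
    cursor.executemany(INSERT_METADATA, params)
    return len(params)
```

Usage would look like insert_metadata_bulk(dbCursor, list_of_pairs), followed by a commit on the connection. No pre- or post-processing of the XML is needed, whatever quotes or other symbols it contains.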

Upvotes: 1
