saromba
saromba

Reputation: 498

Python List to PostgreSQL Array

I have a list:

[u'ABC', u'DEF', u'GHI']

I have to insert it into a postgresql array: (ALTER TABLE "aTable" ADD COLUMN "Test" text[];)

The syntax for adding data to the postgresql is:

update "aTable" SET "Test" = '{"ABC", "DEF", "GHI"}'

How can I convert the list to the correct format?

Upvotes: 14

Views: 37802

Answers (4)

exhuma
exhuma

Reputation: 21727

Note that with psycopg2 you don't need to do any string processing for arrays. This is considered as bad practice as it is error-prone and can - in the worst case - lead to opening up injection attacks! You should always use bound parameters. In the code below, I will create a new table with only one column with the type TEXT[] (as in your original question). Then I will add a new row, and update all of them. So you will see both an INSERT and UPDATE operation (although both are pretty much identical).

There is one Python gotcha though if you update with only one value: cur.execute expects the SQL statement as first argument and an iterable containing the parameters to be bound as second argument. The following will not work:

from psycopg2 import connect

conn = connect('dbname=exhuma')
cur = conn.cursor()
stmt = 'UPDATE foo SET example_value=%s'
new_values = ['a', 'b', 'c']
cur.execute(stmt, (new_values))
conn.commit()

The reason is that (new_values) is seen by python as new_values (the parens are dropped in this case, they are not seen as tuple). This will result in the error that you supply 3 values ('a', 'b' and 'c') as values to be bound, but there's only one placeholder (%s) in the query. Instead you must specify it as follows (notice the added comma at the end):

from psycopg2 import connect

conn = connect('dbname=exhuma')
cur = conn.cursor()
stmt = 'UPDATE foo SET example_value=%s'
new_values = ['a', 'b', 'c']
cur.execute(stmt, (new_values,))
conn.commit()

This will make Python see (new_values,) as a tuple (which is an iterable) with one element, which matches the query place-holders. For a more detailed explanation of the trailing comma, see the official docs on tuples.

Alternatively, you could also write [new_values] instead of (new_values,), but - in my opinion - (new_values,) is cleaner as tuples are immutable, whereas lists are mutable.


Here's the table with which I tested:

CREATE TABLE foo (
    values TEXT[]
);

And here's Python code both inserting and updating values:

from psycopg2 import connect


conn = connect('dbname=exhuma')
cur = conn.cursor()

cur.execute('INSERT INTO foo VALUES (%s)', (['a', 'b'], ))

print('>>> Before update')
cur.execute('SELECT * FROM foo')
for row in cur:
    print(type(row[0]), repr(row[0]))

print('>>> After update')

cur.execute('UPDATE foo SET example_values = %s',
            (['new', 'updated', 'values'],))

cur.execute('SELECT * FROM foo')
for row in cur:
    print(type(row[0]), repr(row[0]))

cur.close()
conn.commit()
conn.close()

On each execution, the code will insert a new row with the same array values, then execute an update without WHERE clause, so all values are updated. After a couple of executions, I this gives the following output:

>>> Before update
(<type 'list'>, "['new', 'updated', 'values']")
(<type 'list'>, "['new', 'updated', 'values']")
(<type 'list'>, "['new', 'updated', 'values']")
(<type 'list'>, "['new', 'updated', 'values']")
(<type 'list'>, "['new', 'updated', 'values']")
(<type 'list'>, "['a', 'b']")
>>> After update
(<type 'list'>, "['new', 'updated', 'values']")
(<type 'list'>, "['new', 'updated', 'values']")
(<type 'list'>, "['new', 'updated', 'values']")
(<type 'list'>, "['new', 'updated', 'values']")
(<type 'list'>, "['new', 'updated', 'values']")
(<type 'list'>, "['new', 'updated', 'values']")

Upvotes: 23

Peter Eisentraut
Peter Eisentraut

Reputation: 36729

You just pass that list as a parameter to execute. You don't need to do anything special. Psycopg converts Python lists into a suitable PostgreSQL array literal.

import psycopg2 as dbapi

conn = dbapi.connect(dsn='')
c = conn.cursor()
x = [u'ABC', u'DEF', u'GHI']
c.execute('update "aTable" SET "Test" = %s', [x])

Upvotes: 11

alko
alko

Reputation: 48317

Proper way of handling variables in SQL is usage of bind variables. To my knowledge, this is crucial for MySQL and Oracle DB, and I believe for PostgreSQL as well.

So better code would be along the lines

def as_array(l):
    l2str = ','.join('"{}"'.format(x) for x in l)
    return '{{{}}}'.format(l2str)

query = '''update "aTable" SET "Test" = %s'''
lst = [u'ABC', u'DEF', u'GHI']

cur = conn.cursor()
cur.execute(query, as_array(lst))

where conn is a connection to PostgreSQL db.

Upvotes: -1

Atul Arvind
Atul Arvind

Reputation: 16743

list = [u'ABC', u'DEF', u'GHI']
list = str(map(str, list))

list = list.replace('[', '{').replace(']', '}').replace('\'', '\"')

query = '''update "aTable" SET "Test" = '%s\'''' %(list)
print query

Will result in to,

update "aTable" SET "Test" = '{"ABC", "DEF", "GHI"}'

Upvotes: 0

Related Questions