RyanM
RyanM

Reputation: 2960

Using Unnest With psycopg2

I've built a Web UI to serve as an ETL application that allows users to select some CSV and TSV files contain large amounts of records and I am attempting to insert them into a PostgreSQL database. As has already been well commented on, this process is kind of slow. After some research it looked like using the UNNEST function would be my answer but I'm having trouble implementing it. Honestly I just didn't find a great walk-through tutorial as I normally do when researching any data processing in Python.

Here's the SQL string as I store them (to be used in functions later):

salesorder_write = """
  INSERT INTO api.salesorder (
    site,
    sale_type,
    sales_rep,
    customer_number,
    shipto_number,
    cust_po_number,
    fob,
    order_number
  ) VALUES (
    UNNEST(ARRAY %s)

"""

I use this string along with a list of tuples like so:

for order in orders:
        inputs=(
            order['site'],
            order['sale_type'],
            order['sales_rep'],
            order['customer_number'],
            order['shipto_number'],
            order['cust_po_number'],
            order['fob'],
            order['order_number']
        )
        tup_list.append(inputs)
cur.execute(strSQL,tup_list)

This gives me the error that Not all arguments converted during string formatting. My first question is How do I need to structure my SQL to be able to pass my list of tuples. My second is, can I use the existing dictionary structure in much the same way?

Upvotes: 2

Views: 1772

Answers (1)

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125414

unnest is not superior to the now (since Psycopg 2.7) canonical execute_values:

from psycopg2.extras import execute_values
orders = [
    dict (
        site = 'x',
        sale_type = 'y',
        sales_rep = 'z',
        customer_number = 1,
        shipto_number = 2,
        cust_po_number = 3,
        fob = 4,
        order_number = 5
    )
]
salesorder_write = """
    insert into t (
        site,
        sale_type,
        sales_rep,
        customer_number,
        shipto_number,
        cust_po_number,
        fob,
        order_number
    ) values %s
"""
execute_values (
    cursor,
    salesorder_write,
    orders,
    template = """(
        %(site)s,
        %(sale_type)s,
        %(sales_rep)s,
        %(customer_number)s,
        %(shipto_number)s,
        %(cust_po_number)s,
        %(fob)s,
        %(order_number)s
    )""",
    page_size = 1000
)

Upvotes: 1

Related Questions