Reputation: 2960
I've built a Web UI to serve as an ETL application that allows users to select some CSV and TSV files contain large amounts of records and I am attempting to insert them into a PostgreSQL database. As has already been well commented on, this process is kind of slow. After some research it looked like using the UNNEST function would be my answer but I'm having trouble implementing it. Honestly I just didn't find a great walk-through tutorial as I normally do when researching any data processing in Python.
Here's the SQL string as I store them (to be used in functions later):
salesorder_write = """
INSERT INTO api.salesorder (
site,
sale_type,
sales_rep,
customer_number,
shipto_number,
cust_po_number,
fob,
order_number
) VALUES (
UNNEST(ARRAY %s)
"""
I use this string along with a list of tuples like so:
for order in orders:
inputs=(
order['site'],
order['sale_type'],
order['sales_rep'],
order['customer_number'],
order['shipto_number'],
order['cust_po_number'],
order['fob'],
order['order_number']
)
tup_list.append(inputs)
cur.execute(strSQL,tup_list)
This gives me the error that Not all arguments converted during string formatting
. My first question is How do I need to structure my SQL to be able to pass my list of tuples. My second is, can I use the existing dictionary structure in much the same way?
Upvotes: 2
Views: 1772
Reputation: 125414
unnest
is not superior to the now (since Psycopg 2.7) canonical execute_values
:
from psycopg2.extras import execute_values
orders = [
dict (
site = 'x',
sale_type = 'y',
sales_rep = 'z',
customer_number = 1,
shipto_number = 2,
cust_po_number = 3,
fob = 4,
order_number = 5
)
]
salesorder_write = """
insert into t (
site,
sale_type,
sales_rep,
customer_number,
shipto_number,
cust_po_number,
fob,
order_number
) values %s
"""
execute_values (
cursor,
salesorder_write,
orders,
template = """(
%(site)s,
%(sale_type)s,
%(sales_rep)s,
%(customer_number)s,
%(shipto_number)s,
%(cust_po_number)s,
%(fob)s,
%(order_number)s
)""",
page_size = 1000
)
Upvotes: 1