Reputation: 1049
Using SQLAlchemy on PostgreSQL, I try to improve performance at insertion (about 100k egdes to insert), executing "nested inserts" in a single query for one edge and its nodes.
Using Insert.from_select
, I get following error and I don't really understand why.
CompileError: bindparam() name 'name' is reserved for automatic usage in the VALUES or SET clause of this insert/update statement. Please use a name other than column name when using bindparam() with insert() or update() (for example, 'b_name').
from sqlalchemy import *
metadata = MetaData()
node = Table('node', metadata,
Column('id', Integer, primary_key=True),
Column('name', String),
)
edge = Table('edge', metadata,
Column('id', Integer, primary_key=True),
Column('name', String),
Column('source_id', Integer(), ForeignKey(node.c.id)),
Column('target_id', Integer(), ForeignKey(node.c.id)),
)
engine = create_engine('postgres://postgres:postgres@db:5432')
metadata.create_all(engine)
e1_source = insert(node).values(name='e1_source').returning(node.c.id).cte('source')
e1_target = insert(node).values(name='e1_target').returning(node.c.id).cte('target')
e1 = insert(edge).from_select(
['source_id', 'target_id', 'name'], # bindparam error
# ['source_id', 'target_id', 'b_name'], # key error
# [edge.c.source_id, edge.c.target_id, edge.c.name], # bindparam error
select([
e1_source.c.id,
e1_target.c.id,
literal('e1'),
])
)
engine.execute(e1)
EDIT: Below, the SQL query I was expected to produces. I remain open to any suggestions to achieve my purpose though.
CREATE TABLE node (
id SERIAL PRIMARY KEY,
name VARCHAR
);
CREATE TABLE edge (
id SERIAL PRIMARY KEY,
source_id INTEGER REFERENCES node (id),
target_id INTEGER REFERENCES node (id),
name VARCHAR
);
WITH source AS (
INSERT INTO node (name)
VALUES ('e1_source')
RETURNING id
), target as (
INSERT INTO node (name)
VALUES ('e1_target')
RETURNING id
)
INSERT INTO edge (source_id, target_id, name)
SELECT source.id, target.id, 'e1'
FROM source, target;
Upvotes: 0
Views: 1220
Reputation: 1049
I have finally figured out where bindparam
was implicitly used by SQLAlchemy to solve my issue: in the node queries and not the edge query as I was first thinking.
But I am still not sure if this is the proper way to perform nested insert queries with SQLAlchemy and if it will improve execution time.
e1_source = insert(node).values(name=bindparam('source_name')).returning(node.c.id).cte('source')
e1_target = insert(node).values(name=bindparam('target_name')).returning(node.c.id).cte('target')
e1 = insert(edge).from_select(
['source_id', 'target_id', 'name'],
select([
e1_source.c.id,
e1_target.c.id,
literal('e1'),
])
)
engine.execute(e1, {
'source_name': 'e1_source',
'target_name': 'e1_target',
})
Upvotes: 2