ABuNeNe
ABuNeNe

Reputation: 29

Adding column in upsert using on_conflict_do_update

I'm trying to add a timestamp column, updated_at, in the updates when doing a upsert using on_conflict_do_update. Below is my current implementation but encounter the below error when running at the last line.

Exception has occurred: ProgrammingError (psycopg2.errors.AmbiguousColumn) column reference "updated_at" is ambiguous LINE 1: ...r, properties = excluded.properties, updated_at = updated_at

Any idea how to implement this?

def postgres_upsert(table, conn, keys, data_iter):
    from sqlalchemy.dialects.postgresql import insert

    data = [dict(zip(keys, row)) for row in data_iter]

    insert_statement = insert(table.table).values(data)

    x = {c.key: c for c in insert_statement.excluded}
    x["updated_at"] = Column('updated_at', TIMESTAMP, default=datetime.datetime.now())

    upsert_statement = insert_statement.on_conflict_do_update(
        constraint=f"{table.table.name}_pkey",
        # set_={c.key: c for c in insert_statement.excluded},
        set_=x,
    )
    conn.execute(upsert_statement)

Upvotes: 0

Views: 81

Answers (2)

Zegarek
Zegarek

Reputation: 26347

If by adding column you mean, to conflict resolution clause, it's already done for you:

the special alias Insert.excluded is available as an attribute on the Insert object; this object is a ColumnCollection which alias contains all columns of the target table

If that table has an updated_at, you already handled it with x = {c.key: c for c in insert_statement.excluded} along with all others.

If the table doesn't have an updated_at and you literally mean you wish to add a completely new column, you'd have to alter the table and I don't think an upsert is a good place to do that.

Another guess would be you're trying to use the updated_at column's default of now(), instead of whatever is currently in the table, also ignoring whatever's coming in under excluded.updated_at. You could pass it as a literal coming in from your app:

x["updated_at"] = datetime.datetime.now()

or use the db's now()

x["updated_at"] = func.now()

The reason you got the error is that you didn't clearly resolve the conflict. You can see how properties are handled:

properties = excluded.properties

This means the column will get the incoming value, overwriting whatever was in the conflicting row in the table earlier. You could've also picked the old value:

properties = your_table.properties

Or a completely different one

properties = 'something else'

The problem goes away if you use func.now() or datetime.now(), but selecting plain updated_at isn't clear as to whether you mean the old one, or the new one that's incoming from the insert.

Upvotes: 1

Midnight_Datatalog
Midnight_Datatalog

Reputation: 1

The AmbiguousColumn error comes from a lack of clarification on the origin of that column. This usually happens when two tables have a column named the same.

Inside the on_conflict_do_update, set_ should be a dictionary or other mapping object where the keys are either names of columns in the target table, or Column object. You are kinda mixing both.

This is how I would do it, hope this helps:

def postgres_upsert(table, conn, keys, data_iter):
    data = [dict(zip(keys, row)) for row in data_iter]
    insert_statement = insert(table.table).values(data)
     x = {
        **{c.key: getattr(insert_statement.excluded, c.key) for c in insert_statement.excluded},
        "updated_at": datetime.now()
    }

    upsert_statement = insert_statement.on_conflict_do_update(
        constraint=f"{table.table.name}_pkey",
        set_=x
    )
    conn.execute(upsert_statement)

Upvotes: 0

Related Questions