Ruslan
Ruslan

Reputation: 951

Recreate table indexes in SQLAlchemy

I have a database that's been created using SQLAlchemy. The engine is postgres, and the normal use-case for the DB is bulk insertions. According to several guides, it was suggested to disable constraints (the insertion process is very well tested), do the bulk insertions, then re-enable the constraints. The reason for that is that creating indices and checking constraints (foreign key etc.) on the fly is hugely computationally expensive. Bulk insertions become expensive, and they get significantly more expensive as the table gets bigger.

The foreign keys constraints are maintained (+easily testable) but I'm not sure about indexes: doing the bulk insertions that way doesn't create the indexes, so they exist "on paper" only. Right?

If so, is there a simple way to either:

  1. A. Recreate all tables' constraints (most importantly, indexes)?
  2. Manually, drop all indexes, then recreate them?
  3. Trigger "update" on indexes?

I prefer to keep solutions within SQLAlchemy, but it there's a simple postgres solution, I'll take it.

EDIT: Ok, it seems there's a simple postgres statement to do that:

conn.execution_options(isolation_level="AUTOCOMMIT").execute(
                       "REINDEX DATABASE my_db_name;")

Is there an "SQLAlchemy" friendly way to do the same?

Example table declaration:

import sqlalchemy as db
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Table
from utils.db_utils import parse_datetime_to_short_date
Base = declarative_base()

class ProductsInfo(Base):
    __table__ = Table('products_info', Base.metadata,
                      db.Column('product_id', db.Integer, primary_key=True, autoincrement=False, 
                                 nullable=False, index=True, unique=True, comment='Unique product-id'),
                      db.Column('link', db.String(255), default=None, comment='Product URL'),
                      )
    def __repr__(self):
        return f"<ProductsInfo(product_id='{self.product_id}', link='{self.link}')>"

Upvotes: 0

Views: 1045

Answers (1)

Ruslan
Ruslan

Reputation: 951

Ended up doing bulk-insertions like so:

  1. Disable tables' triggers (not all tables), using postgres

  2. Bulk-insertions

  3. Enable tables' triggers

  4. Reindex tables

    ALTER TABLE my_table DISABLE TRIGGER ALL

    -- bulk insertions

    ALTER TABLE my_table ENABLE TRIGGER ALL conn.execution_options(isolation_level="AUTOCOMMIT").execute("REINDEX TABLE my_table;")

Upvotes: 1

Related Questions