dtheodor

Reputation: 5060

Alembic support for multiple Postgres schemas

How can I use Alembic's --autogenerate to migrate multiple Postgres schemas that are not hard-coded in the SQLAlchemy model? (Mirror question of SQLAlchemy support of Postgres Schemas, but for Alembic.)

In particular, we use Postgres schemas to separate different clients that share the same set of tables. Moreover, there is a schema with objects shared among all clients. The SQLAlchemy model has no knowledge of schemas; the schema is set at run-time using session.execute("SET search_path TO client1,shared").
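
A minimal sketch of that runtime setup, assuming a SQLAlchemy Session named session (the use_client_schema helper and the schema names are only for illustration):

from sqlalchemy import text

def use_client_schema(session, client_schema):
    # Point this connection at the client's schema plus the shared one;
    # the model itself stays schema-agnostic. client_schema must be a
    # trusted identifier, since it is interpolated into the SQL string.
    session.execute(text("SET search_path TO %s, shared" % client_schema))

# e.g. use_client_schema(session, "client1")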

The default --autogenerate is not helping at all: it detects schemas in the database that do not exist in the model and generates a migration that drops them and re-creates every table in the default schema.

I would really like to use --autogenerate though, with the proper plumbing to set the schemas correctly. Any suggestions on if/how Alembic's API can do this?

Upvotes: 22

Views: 15681

Answers (5)

Maksim Leali

Reputation: 27

I may be late, but I ran into this issue just this week. Some people try using include_object, but they usually only check whether an object of type 'table' is in the right schema.

In env.py, the include_object hook and the context configuration look like this:

def include_object(object, name, type_, reflected, compare_to):
    # Only include tables that belong to the model's schema (e.g. 'msw')
    if type_ == 'table' and object.schema == target_metadata.schema:
        return True

    return False

context.configure(
    connection=connection,
    target_metadata=target_metadata,
    include_schemas=True,
    include_object=include_object,
    version_table_schema=target_metadata.schema,
)

but this leaves out every column, so you also need a check for type_ == 'column', like this:

def include_object(object, name, type_, reflected, compare_to):
    # Include tables (and their columns) that belong to the model's schema
    if type_ == 'table' and object.schema == target_metadata.schema:
        return True

    if type_ == 'column' and object.table.schema == target_metadata.schema:
        return True

    return False

Upvotes: 0

Ghasem

Reputation: 15563

This is how I added multiple schemas to my models and Alembic files. A model looks like this:

class A(Base):
    __tablename__ = 'a'

    id = Column(Integer, primary_key=True)
    foo = Column(Integer)

    __table_args__ = {
        "schema": "schema1"
    }


class B(Base):
    __tablename__ = 'b'

    id = Column(Integer, primary_key=True)
    foo = Column(Integer)

    __table_args__ = {
        "schema": "schema2"
    }

And then in env.py:

target_metadata = models.Base.metadata
all_tenants = ['schema1', 'schema2']

def do_run_migrations(connection: Connection) -> None:
    for tenant_schema_name in all_tenants:
        connection.execute(text(f"CREATE SCHEMA IF NOT EXISTS {tenant_schema_name}"))
        # set search path on the connection, which ensures that
        # PostgreSQL will emit all CREATE / ALTER / DROP statements
        # in terms of this schema by default
        connection.execute(text('set search_path to "%s"' % tenant_schema_name))
        # in SQLAlchemy v2+ the search path change needs to be committed
        connection.commit()

        # make use of non-supported SQLAlchemy attribute to ensure
        # the dialect reflects tables in terms of the current tenant name
        connection.dialect.default_schema_name = tenant_schema_name

        context.configure(
            connection=connection,
            target_metadata=target_metadata,
            include_schemas=True,
            version_table_schema=tenant_schema_name
        )

        with context.begin_transaction():
            context.run_migrations()
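
For context, do_run_migrations() would be driven by the usual online entry point; a minimal sketch, assuming the standard env.py names (config, engine_from_config, pool):

def run_migrations_online() -> None:
    # Build the Engine from alembic.ini and hand a connection
    # to the per-tenant loop above.
    connectable = engine_from_config(
        config.get_section(config.config_ini_section),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )

    with connectable.connect() as connection:
        do_run_migrations(connection)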

Upvotes: 1

Tex Andersen

Reputation: 43

This is a pretty old question, but for anyone who ran into the same problem, I solved it by using the following in my env.py:

def run_migrations_online():
    """Run migrations in 'online' mode.

    In this scenario we need to create an Engine
    and associate a connection with the context.

    """
    connectable = engine_from_config(
        config.get_section(config.config_ini_section),
        prefix='sqlalchemy.',
        poolclass=pool.NullPool)

    with connectable.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=target_metadata,
            include_schemas=True,
            version_table_schema=build  # <-- This is the relevant line
        )

        with context.begin_transaction():
            context.run_migrations()

where build is a string holding the desired schema name. My use case is slightly different (multiple distributed builds with a single database containing multiple identical schemas), but I was running into the same problem of Alembic not correctly detecting the schema I was trying to connect to.

I use environment variables to determine the correct build, as it works quite well with Zappa.
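
For example, something along these lines near the top of env.py (the BUILD_SCHEMA variable name is just an illustration):

import os

# Resolve the target schema from the environment; fall back to "public".
build = os.environ.get("BUILD_SCHEMA", "public")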

Upvotes: 3

zzzeek

Reputation: 75117

From issue 409: applying upgrade/downgrade operations to tenant-specific schemas is most easily done using translated schema names, which is also how you would normally run the main application for multi-tenancy.

Go into env.py:

def run_migrations_online():

    connectable = engine_from_config(
        config.get_section(config.config_ini_section),
        prefix='sqlalchemy.',
        poolclass=pool.NullPool)

    with connectable.connect() as connection:
        for tenant_schema_name in all_my_tenant_names:
            conn = connection.execution_options(
                schema_translate_map={None: tenant_schema_name})

            logger.info("Migrating tenant schema %s" % tenant_schema_name)
            context.configure(
                connection=conn,
                target_metadata=target_metadata
            )

            # to do each tenant in its own transaction.
            # move this up to do all tenants in one giant transaction
            with context.begin_transaction():
                context.run_migrations()

Above will translate the "None" schema name into the given tenant name. If the application shares tenant-based schemas with a default schema that has global tables, then you'd be using some token like "tenant_schema" as the symbol:

for tenant_schema_name in all_my_tenant_names:
    conn = connection.execution_options(
        schema_translate_map={"tenant_schema": tenant_schema_name})

and in migration files refer to "tenant_schema" where the actual tenant-specific schema name goes:

def upgrade():
    op.alter_column("some_table", "some_column", <migration options>, schema="tenant_schema")
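
For the translate map to line up with autogenerate, the tenant-specific tables in the model metadata would declare the same symbolic name, along these lines (a sketch; SomeTable is illustrative):

class SomeTable(Base):
    __tablename__ = "some_table"
    # symbolic schema, rewritten per tenant by schema_translate_map
    __table_args__ = {"schema": "tenant_schema"}

    id = Column(Integer, primary_key=True)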

For the "autogenerate" case, the solution @nick-retallack provides has some more of the pieces you would use on that end, namely the use of include_schemas so that autogenerate looks only at a "specimen" schema that represents the latest version of the tenant-specific schema.

In order to set up env.py to use the right system for the right command, the behavior can be controlled with user-defined options via context.get_x_argument().
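
A sketch of that kind of switch in env.py, using the -x command line option (the "tenant" key is arbitrary):

# invoked e.g. as: alembic -x tenant=client1 upgrade head
x_args = context.get_x_argument(as_dictionary=True)
tenant = x_args.get("tenant")

if tenant:
    # migrate a single tenant schema
    tenant_schema_names = [tenant]
else:
    # default: run against every known tenant
    tenant_schema_names = all_my_tenant_names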

Upvotes: 10

Nick Retallack

Reputation: 19561

Here's a workable solution: https://gist.github.com/nickretallack/bb8ca0e37829b4722dd1

It still requires you to edit the schema name out of each migration after it's generated, but at least the interesting work is done by Alembic. Thanks to Michael Bayer for help with this on the mailing list.

Upvotes: 1
