Reputation: 81

Schema versioning within the Snowflake data warehouse

I am interested in ways in which users of snowflake database can be insulated from change via the use of schema versioning. I have been investigating the use of connection syntax to define a schema where a new schema holding views to the core tables would be created for each release, any views unchanged would be copied others which were amended would be made backwards compatible. As users connect they would ideally be given the correct connection syntax for the version they required.

The problem I have is that there are multiple teams each owning schemas associated with a core business area and I don't think it is possible to define multiple schemas in the connection syntax.

Has anyone achieved this in an environment with multiple users, schemas and development teams?

Regards,

Luke

Upvotes: 8

Answers (4)

jm-nab

Reputation: 74

Alembic and SqlAlchemy seems to be the suggested approach by snowflake.

For example by adding this to the alembic env.py:

from alembic.ddl.impl import DefaultImpl

class SnowflakeImpl(DefaultImpl):
    __dialect__ = 'snowflake'

Will resolve the following error:

    return _impls[dialect.name]
           ~~~~~~^^^^^^^^^^^^^^
KeyError: 'snowflake'

This approach worked.

Reference:

https://docs.snowflake.com/developer-guide/python-connector/sqlalchemy#alembic-support

Upvotes: 0

nofinator

Reputation: 3023

Snowflake Labs has their own Database Change Management tool called schemachange that looks promising. (We haven't implemented it, but are considering it.) It is written in Python and based on Flyway, which is one of the more popular DCM tools available.

When combined with a version control system and a CI/CD tool, database changes can be approved and deployed through a pipeline using modern software delivery practices. As such schemachange plays a critical role in enabling Database (or Data) DevOps.

Upvotes: 0

Brock

Reputation: 284

We use Alembic for our Database version control for Snowflake. Alembic is a "migration" tool where you can run multiple changes (or a migration) to your Data Warehouse. It's essentially an add-on to the SQLAlchemy library in Python.

When developing locally, we create a clone of our database, and test our migration changes to the cloned database. Once we know it works, we push it to GitLab, get it approved, then we can run a CI/CD pipeline that has accountadmin credentials to make the change in production.

Since it's written in Python, you can connect this to your Git tool (like GitHub or GitLab) and submit changes in a Merge Request and get approval before running this in your Production database.

Here's the documentation: https://alembic.sqlalchemy.org/en/latest/

This is also officially supported according to Snowflake documentation: https://docs.snowflake.net/manuals/user-guide/sqlalchemy.html#alembic-support

An example Alembic migration might look like:


Revision ID: 78a3acc7fbb2
Revises: 3f2ee8d809a6
Create Date: 2019-11-06 11:40:38.438468

"""

# revision identifiers, used by Alembic.
revision = '78a3acc7fbb2'
down_revision = '3f2ee8d809a6'
branch_labels = None
depends_on = None

from alembic import op
import sqlalchemy as sa

def upgrade():
    op.create_table('test_table',
    sa.Column('op', sa.String(length=255), nullable=True),
    sa.Column('id', sa.String(length=255), nullable=False),
    sa.Column('amount', sa.BigInteger(), nullable=True),
    sa.Column('reason', sa.String(length=255), nullable=True),
    sa.Column('deleted', sa.Boolean(), nullable=True),
    sa.Column('user_id', sa.Integer(), nullable=True),
    sa.Column('company_id', sa.Integer(), nullable=True),
    sa.Column('inserted_at', sa.DateTime(), nullable=True),
    sa.Column('updated_at', sa.DateTime(), nullable=True),
    sa.Column('dw_import_filename', sa.String(length=255), nullable=True),
    sa.Column('dw_import_file_row_number', sa.Integer(), nullable=True),
    sa.Column('dw_import_timestamp', sa.TIMESTAMP(), nullable=True),
    sa.PrimaryKeyConstraint('id'),
    schema='test_schema'
    )

def downgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_table('test_table', schema='test_schema')

As you can see, you have to supply an upgrade and have the ability to downgrade, which reverses the upgrade. If you have any other questions about Alembic or if this interests you then I'd be happy to explain more.

Upvotes: 3

Chris

Reputation: 690

As the first step in your release process you could CLONE your "current version" schema to a new "x version" schema, then deploy your new schema with the original name.

Example:

CREATE SCHEMA MY_DB.MY_SCHEMA_20200103 CLONE MY_DB.MY_SCHEMA;

... deployment release steps ...

Important Notes

You must be careful with all references to ensure they are pointing the correct object, whether that be an object in the versioned schema or the current schema.
This will copy the grants on all the objects within the schema, but privileges on the newly created schema will need to be granted

Upvotes: 0

Schema versioning within the Snowflake data warehouse

Answers (4)

Example:

Important Notes

Related Questions