Skip to content

Latest commit

 

History

History
89 lines (62 loc) · 3.39 KB

introduction.md

File metadata and controls

89 lines (62 loc) · 3.39 KB

Introduction

Modifications to the PyCDS schema definition are managed using Alembic, a database migration management tool based on SQLAlchemy.

In short, Alembic supports and disciplines two processes of database schema change:

  1. Creation of database migration scripts (Python programs) that modify the schema of a database.

  2. Application of migrations to specific database instances.

  • In particular, Alembic can be used to create a new instance of a modelmeta database by migrating an empty database to the current state. This is described in detail below.

For more information, see the Alembic tutorial.

Specifying the database to operate on

We have customized the Alembic environment manager (alembic/env.py) so that it is possible to operate on any of an arbitrary number of databases defined in alembic.ini, according to an alembic command line argument. This argument takes the form

alembic -x db=<db-label> ...

and it must always be included in any alembic command.

The database to operate upon is specified by <db-label> (e.g., dev), and the meaning of that name is itself specified in alembic.ini. For any db-name you use, a corresponding entry in alembic.ini must exist, of the following form:

[<db-label>]
sqlalchemy.url = <DSN>

For example:

[dev]
sqlalchemy.url = postgresql://tester@localhost:30599/pycds_test

The file alembic.ini already contains several such db-names. We expect to expand this list as more CRMP-type databases are added to the PCIC stable.

Environment variables affecting Alembic operations

The following environment variables affect Alembic operations:

  • PYCDS_SCHEMA_NAME: Name of the schema within the database to be targeted by Alembic operations.
    • Default if not specified: crmp.
  • PYCDS_SU_ROLE_NAME_: Name of role to use for high-privilege operations (e.g., creating functions using untrusted languages such as plypythonu).
    • Default if not specified: pcicdba.
    • Such a role is typically only enabled for the usual lower-privileged user for a limited period while revisions requiring the higher privilege are executed, and not otherwise.
    • Most revisions do not require higher privilege, so this is moot for migrations that do not include them.

All Alembic commands therefore have the generic form

[PYCDS_SCHEMA_NAME=<schema name>] [PYCDS_SU_ROLE_NAME=<role name>] alembic -x db=<db-label> ...

But note that default values and infrequent need for high privilege mean that both options may be omitted.

Specifying the schema within the database

PyCDS enables the user to specify the schema (i.e., named collection of tables, etc.) to be operated upon using the environment variable PYCDS_SCHEMA_NAME.

This has been accomplished in two ways:

  • By creating a modified Alembic environment that uses the specified schema name (see env.py#run_migrations_online() and env.py#run_migrations_offline(); specifically, context.configure(version_table_schema=target_metadata.schema)).
  • By using the specified schema name in all migrations. See note below.

IMPORTANT: Autogenerated migrations must be edited to replace the specific schema name (e.g., schema='crmp') with the specified schema name (schema=pycds.get_schema_name()) wherever it occurs.