Modifications to the PyCDS schema definition are managed using Alembic, a database migration management tool based on SQLAlchemy.
In short, Alembic supports and disciplines two processes of database schema change:
-
Creation of database migration scripts (Python programs) that modify the schema of a database.
-
Application of migrations to specific database instances.
- In particular, Alembic can be used to create a new instance of a
modelmeta
database by migrating an empty database to the current state. This is described in detail below.
For more information, see the Alembic tutorial.
We have customized the Alembic environment manager (alembic/env.py
) so that it is
possible to operate on any of an arbitrary number of databases defined in alembic.ini
,
according to an alembic
command line argument. This argument takes the form
alembic -x db=<db-label> ...
and it must always be included in any alembic
command.
The database to operate upon is specified by <db-label>
(e.g., dev
), and the meaning of that name is itself specified
in alembic.ini
. For any db-name you use, a corresponding entry in alembic.ini
must exist, of the following
form:
[<db-label>]
sqlalchemy.url = <DSN>
For example:
[dev]
sqlalchemy.url = postgresql://tester@localhost:30599/pycds_test
The file alembic.ini
already contains several such db-names.
We expect to expand this list as more CRMP-type databases are added to the PCIC stable.
The following environment variables affect Alembic operations:
PYCDS_SCHEMA_NAME
: Name of the schema within the database to be targeted by Alembic operations.- Default if not specified:
crmp
.
- Default if not specified:
PYCDS_SU_ROLE_NAME_
: Name of role to use for high-privilege operations (e.g., creating functions using untrusted languages such asplypythonu
).- Default if not specified:
pcicdba
. - Such a role is typically only enabled for the usual lower-privileged user for a limited period while revisions requiring the higher privilege are executed, and not otherwise.
- Most revisions do not require higher privilege, so this is moot for migrations that do not include them.
- Default if not specified:
All Alembic commands therefore have the generic form
[PYCDS_SCHEMA_NAME=<schema name>] [PYCDS_SU_ROLE_NAME=<role name>] alembic -x db=<db-label> ...
But note that default values and infrequent need for high privilege mean that both options may be omitted.
PyCDS enables the user to specify the schema (i.e., named collection of tables, etc.) to be operated upon
using the environment variable PYCDS_SCHEMA_NAME
.
This has been accomplished in two ways:
- By creating a modified Alembic environment that uses the specified schema name
(see
env.py#run_migrations_online()
andenv.py#run_migrations_offline()
; specifically,context.configure(version_table_schema=target_metadata.schema)
). - By using the specified schema name in all migrations. See note below.
IMPORTANT: Autogenerated migrations must be edited to replace the specific schema name (e.g., schema='crmp'
)
with the specified schema name (schema=pycds.get_schema_name()
) wherever it occurs.