Add a central schema yaml file.
This contains all the tables and row entries in a central location.

Now the schema creation script, the CLI and the docs can all use this file, so we don't have to worry about missing a place when adding a new entry to a table.
stuartmcalpine committed Dec 2, 2023
1 parent ebbfa5c commit 9edbe08
Showing 9 changed files with 410 additions and 361 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/documentation.yml
@@ -10,7 +10,7 @@ jobs:
run: |
python -m pip install --upgrade pip
python -m pip install .
- pip install sphinx sphinx_rtd_theme sphinx_toolbox sphinxcontrib-autoprogram
+ pip install sphinx sphinx_rtd_theme sphinx_toolbox sphinxcontrib-autoprogram sphinxcontrib.datatemplates
- name: Sphinx build
run: |
sphinx-build docs/source _build
3 changes: 3 additions & 0 deletions docs/source/_static/css/custom.css
@@ -0,0 +1,3 @@
.tight-table td {
white-space: normal !important;
}
9 changes: 8 additions & 1 deletion docs/source/conf.py
@@ -5,7 +5,8 @@
"sphinx_rtd_theme",
"sphinx.ext.autodoc",
'sphinx.ext.napoleon',
- 'sphinxcontrib.autoprogram'
+ 'sphinxcontrib.autoprogram',
+ 'sphinxcontrib.datatemplates'
]

project = 'DESC data management'
@@ -36,3 +37,9 @@
html_logo = '_static/DREGS_logo_v2.png'

autoclass_content = 'both'

templates_path = ['templates']

html_css_files = [
'css/custom.css',
]
247 changes: 2 additions & 245 deletions docs/source/reference_schema.rst
@@ -7,248 +7,5 @@ database (e.g., the default and production schemas) follows the same structure.
.. image:: _static/schema_plot.png
:alt: Image missing

The dataset table
-----------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``dataset_id``
- Unique identifier for dataset
- int
* - ``name``
- User given name for dataset
- str
* - ``relative_path``
- Relative path storing the data, relative to `<root_dir>`
- str
* - ``version_major``
- Major version in semantic string (i.e., X.x.x)
- int
* - ``version_minor``
- Minor version in semantic string (i.e., x.X.x)
- int
* - ``version_patch``
- Patch version in semantic string (i.e., x.x.X)
- int
* - ``version_suffix``
- Optional version suffix
- str
* - ``dataset_creation_date``
- Dataset creation date
- datetime
* - ``is_archived``
- True if the data is archived, i.e., the data is no longer within `<root_dir>`
- bool
* - ``is_external_link``
- ???
- bool
* - ``is_overwritten``
- True if the original data for this dataset has been overwritten at some point. This would have required that ``is_overwritable`` was set to ``true`` on the original dataset
- bool
* - ``is_valid``
- ???
- bool
* - ``register_date``
- Date the dataset was registered
- datetime
* - ``creator_uid``
- `uid` (user id) of the person that registered the dataset
- str
* - ``access_API``
- Describes the software that can read the dataset (e.g., "gcr-catalogs", "skyCatalogs")
- str
* - ``execution_id``
- ID of execution this dataset belongs to
- int
* - ``description``
- User provided description of the dataset
- str
* - ``owner_type``
- Datasets owner type, can be "user", "group", "project" or "production".
- str
* - ``owner``
- Owner of the dataset
- str
* - ``data_org``
- Dataset organisation ("file" or "directory")
- str
* - ``nfiles``
- How many files are in the dataset
- int
* - ``total_disk_space``
- Total disk space used by the dataset
- float

The dataset_alias table
-----------------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``dataset_alias_id``
- Unique identifier for alias
- int
* - ``name``
- User given alias name
- str
* - ``dataset_id``
- ID of dataset this is an alias for
- int
* - ``supersede_date``
- If a new entry has been added to the table with the same alias name (but
different dataset_id), the old entry will be superseded. ``supersede_date``
in the old entry tracks when this happened. If the entry has not been
superseded, ``supersede_date`` will be None
- datetime
* - ``register_date``
- Date the dataset was registered
- datetime
* - ``creator_uid``
- `uid` (user id) of the person that registered the dataset
- str

The dependency table
--------------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``dependency_id``
- Unique identifier for dependency
- int
* - ``execution_id``
- Execution this dependency is linked to
- int
* - ``input_id``
- Dataset ID of the dependent dataset
- int
* - ``register_date``
- Date the dependency was registered
- datetime

The execution table
-------------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``execution_id``
- Unique identifier for execution
- int
* - ``description``
- User given description of execution
- str
* - ``name``
- User given execution name
- str
* - ``register_date``
- Date the execution was registered
- datetime
* - ``execution_start``
- Date the execution started
- datetime
* - ``locale``
- Locale of execution (e.g., NERSC)
- str
* - ``configuration``
- Path to configuration file of execution
- str
* - ``creator_uid``
- `uid` (user id) of the person that registered the dataset
- str

The execution_alias table
-------------------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``execution_alias_id``
- Unique identifier for execution alias
- int
* - ``execution_id``
- Execution this alias is linked to
- int
* - ``alias``
- User given execution alias name
- str
* - ``register_date``
- Date the execution was registered
- datetime
* - ``supersede_date``
- If a new entry has been added to the table with the same alias name (but
different dataset_id), the old entry will be superseded. ``supersede_date``
in the old entry tracks when this happened. If the entry has not been
superseded, ``supersede_date`` will be None
- datetime
* - ``creator_uid``
- `uid` (user id) of the person that registered the dataset
- str

The provenance table
--------------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``provenance_id``
- Unique identifier for provenance
- int
* - ``code_version_major``
- Major version of code when this schema was created
- int
* - ``code_version_minor``
- Minor version of code when this schema was created
- int
* - ``code_version_patch``
- Patch version of code when this schema was created
- int
* - ``code_version_suffix``
- Version suffix of code when this schema was created
- str
* - ``db_version_major``
- Major version of database
- int
* - ``db_version_minor``
- Minor version of database
- int
* - ``db_version_patch``
- Patch version of database
- int
* - ``git_hash``
- Git commit hash when this schema was created
- str
* - ``repo_is_clean``
- Was repository clean when this schema was created
- bool
* - ``update_method``
- "CREATE", "MODIFY" or "MIGRATE"
- str
* - ``schema_enabled_date``
- When was the schema enabled
- datetime
* - ``creator_uid``
- `uid` (user id) of the person that registered the schema
- str
* - ``comment``
- Any comment
- str
.. datatemplate:yaml:: ../../src/dataregistry/schema/schema.yaml
:template: schema_table.tmpl
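
Since the docs tables are now generated from the YAML file, every entry needs both a ``description`` and a ``type`` for its row to render completely. The following is a minimal sketch of a pre-build check, not part of this commit: the function name `find_incomplete_columns` and the in-memory `example_schema` are hypothetical, and the nested table → column → fields layout is assumed from how the template indexes `data[table][item][item2]`.

```python
# Keys the docs template reads for every column entry.
REQUIRED_KEYS = ("description", "type")


def find_incomplete_columns(schema):
    """Return (table, column, missing_key) triples for incomplete entries.

    `schema` is assumed to be a mapping of table name -> column name ->
    field mapping, i.e. the shape the template expects from schema.yaml.
    """
    problems = []
    for table, columns in schema.items():
        for column, fields in columns.items():
            for key in REQUIRED_KEYS:
                if key not in fields:
                    problems.append((table, column, key))
    return problems


# Hypothetical stand-in for the parsed YAML file (not the real contents).
example_schema = {
    "dependency": {
        "dependency_id": {
            "description": "Unique identifier for dependency",
            "type": "int",
        },
        # Deliberately missing "type" to show what the check reports.
        "input_id": {"description": "Dataset ID of the dependent dataset"},
    },
}

print(find_incomplete_columns(example_schema))
# -> [('dependency', 'input_id', 'type')]
```

In practice the mapping would come from parsing the YAML file itself; the literal dictionary here only keeps the sketch free of third-party dependencies.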
22 changes: 22 additions & 0 deletions docs/source/templates/schema_table.tmpl
@@ -0,0 +1,22 @@
.. -*- mode: rst -*-
{% for table in ['execution','provenance','execution_alias','dataset','dependency','dataset_alias'] %}

The {{table}} table
----------------------------------------

.. list-table::
:header-rows: 1
:class: tight-table

* - row
- description
- type

{% for item in data[table] %}
* - {{item}}
{% for item2 in ['description', 'type'] %}
- {{data[table][item][item2]}}
{% endfor %}
{% endfor %}
{% endfor %}
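
To make the template's nested loops easier to follow, here is a rough plain-Python equivalent of what it emits. This is an illustrative sketch only: `render_schema_tables` and `sample` are hypothetical names, the data layout is assumed from the template's `data[table][item][item2]` lookups, and the heading underline is sized to the title rather than using the template's fixed-width rule.

```python
def render_schema_tables(data, tables):
    """Build reStructuredText list-tables, one per requested table name."""
    lines = []
    for table in tables:
        title = f"The {table} table"
        lines += [
            title,
            "-" * len(title),  # underline matched to the title length
            "",
            ".. list-table::",
            "   :header-rows: 1",
            "   :class: tight-table",
            "",
            "   * - row",
            "     - description",
            "     - type",
        ]
        # One table row per column entry, mirroring the inner template loops.
        for column, fields in data[table].items():
            lines.append(f"   * - {column}")
            for key in ("description", "type"):
                lines.append(f"     - {fields[key]}")
        lines.append("")
    return "\n".join(lines)


# Hypothetical sample data in the assumed schema layout.
sample = {
    "dependency": {
        "dependency_id": {
            "description": "Unique identifier for dependency",
            "type": "int",
        },
    },
}

print(render_schema_tables(sample, ["dependency"]))
```

Passing the table names explicitly mirrors the hard-coded list in the template, which fixes the order in which the tables appear in the docs.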