deploy: c6057d5
stuartmcalpine committed Nov 8, 2023
0 parents commit 0115507
Showing 74 changed files with 8,668 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .buildinfo
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: f032b379b277d0b6e67adf305e51553b
tags: 645f666f9bcd5a90fca523b33c5a78b7
Binary file added .doctrees/contact.doctree
Binary file not shown.
Binary file added .doctrees/environment.pickle
Binary file not shown.
Binary file added .doctrees/index.doctree
Binary file not shown.
Binary file added .doctrees/installation.doctree
Binary file not shown.
Binary file added .doctrees/reference_cli.doctree
Binary file not shown.
Binary file added .doctrees/reference_python.doctree
Binary file not shown.
Binary file added .doctrees/reference_schema.doctree
Binary file not shown.
Binary file added .doctrees/tutorial_cli.doctree
Binary file not shown.
Binary file added .doctrees/tutorial_python.doctree
Binary file not shown.
Binary file added .doctrees/tutorial_setup.doctree
Binary file not shown.
Empty file added .nojekyll
Empty file.
Binary file added _images/schema_plot.png
12 changes: 12 additions & 0 deletions _sources/contact.rst.txt
@@ -0,0 +1,12 @@
Contact
=======

If you need to contact someone directly for assistance we are happy to help!

- Admin: **Joanne Bogart** (`@JoanneBogart <https://www.github.com/JoanneBogart>`__)
- Admin: **Stuart McAlpine** (`@stuartmcalpine <https://www.github.com/stuartmcalpine>`__)

For any bugs, or suggestions for additional features within the DESC data
management software, please raise an issue via the GitHub repository
(`https://github.com/LSSTDESC/dataregistry/issues
<https://github.com/LSSTDESC/dataregistry/issues>`__).
68 changes: 68 additions & 0 deletions _sources/index.rst.txt
@@ -0,0 +1,68 @@
Welcome to the DESC data management software documentation
==========================================================

The data registry is a means of keeping track of DESC related datasets,
providing both a shared space at NERSC for the raw data, and a registry
database to store provenance information for that data, for example:

- where the data is located
- when the data was produced
- what precursor datasets it relies on

What and whom is it for?
------------------------

It is for any datasets for which provenance and accessibility are important, e.g.

- they are of general interest within the collaboration
- they are used as input to further analysis steps
- they are referenced in a paper

It is for anyone at DESC who needs to create, find or access such a dataset.

Getting started
---------------

This documentation helps you get set up with the ``dataregistry`` Python
package, covering installation, how to register datasets, and how to query for
them.

.. toctree::
   :maxdepth: 2
   :caption: Overview:
   :hidden:

   installation

.. toctree::
   :maxdepth: 2
   :caption: Tutorials:
   :hidden:

   tutorial_setup
   tutorial_python
   tutorial_cli

.. toctree::
   :maxdepth: 2
   :caption: Reference:
   :hidden:

   reference_python
   reference_cli
   reference_schema

.. toctree::
   :maxdepth: 2
   :caption: Contact:
   :hidden:

   contact


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
109 changes: 109 additions & 0 deletions _sources/installation.rst.txt
@@ -0,0 +1,109 @@
.. _installation:

Installation
============

Currently the DESC data registry software can only be used at NERSC (i.e.,
Perlmutter).

Main installation steps
-----------------------

When installing the ``dataregistry`` package, it is recommended to work within
your own Conda or Python virtual environment.

Creating a Conda environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can make a new Conda environment via

.. code-block:: bash

   module load conda/Mambaforge-22.11.1-4
   conda create -p ./datareg_env psycopg2

where ``./datareg_env`` is the path where the environment will be installed
(change this as required). To activate the environment do

.. code-block:: bash

   conda activate <path to your env>

Creating a Python venv
~~~~~~~~~~~~~~~~~~~~~~

Alternatively, you can work within a Python virtual environment via

.. code-block:: bash

   module load python/3.10
   python3 -m venv ./datareg_env

where ``./datareg_env`` is the path where the environment will be installed
(change this as required). To activate the environment do

.. code-block:: bash

   source <path to your env>/bin/activate

Note that the specific version of Python used above (``3.10``) is only an
example; the ``dataregistry`` package is supported on Python versions ``>3.7``.
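
Before installing, it can be useful to confirm which kind of environment the
interpreter is running in. The following is a minimal sketch using only the
standard library; the function names are illustrative and not part of
``dataregistry``, and reading ``>3.7`` as "3.8 or newer" is an assumption.

```python
import os
import sys


def active_environment() -> str:
    """Return 'venv', 'conda', or 'none' for the running interpreter."""
    # Inside a venv, sys.prefix differs from the base installation prefix.
    if sys.prefix != sys.base_prefix:
        return "venv"
    # Conda exports CONDA_PREFIX while an environment is activated.
    if os.environ.get("CONDA_PREFIX"):
        return "conda"
    return "none"


def python_is_supported(version=sys.version_info) -> bool:
    """Literal reading of '>3.7' (an assumption): minor release 3.8 or newer."""
    return tuple(version[:2]) > (3, 7)


if __name__ == "__main__":
    print("environment:", active_environment())
    print("python supported:", python_is_supported())
```

Note that an activated Conda ``base`` environment also sets ``CONDA_PREFIX``,
so a result of ``conda`` does not by itself mean a dedicated environment is in use.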

Installing the ``dataregistry`` package
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now we can install the DESC data registry software. First clone the GitHub
repository

.. code-block:: bash

   git clone https://github.com/LSSTDESC/dataregistry.git

then, navigate to the ``dataregistry`` directory and install via *pip* using

.. code-block:: bash

   python3 -m pip install .

You can test to see if the ``dataregistry`` package has installed successfully
by typing

.. code-block:: bash

   python3 -c "import dataregistry; print(dataregistry.__version__)"

If you see the current package version printed to the console, success!

.. _one-time-setup:

Authenticating with the database
--------------------------------

A one-time setup is required in order to authenticate with the DESC data
registry database. This is done via a YAML configuration file which stores the
connection information to the database, and a ``.pgpass`` file, which stores
user credentials.

First, make a ``dataregistry`` configuration file. We recommend a file named
``~/.config_reg_access`` stored in your ``$HOME`` directory, containing the
entry

.. code-block:: yaml

   sqlalchemy.url : postgresql://reg_writer@data-registry-dev-loadbalancer.jrb-test.development.svc.spin.nersc.org:5432/desc_data_registry

Then (if you don't have one already), create a file named ``~/.pgpass`` in your
``$HOME`` directory, and append the entry

.. code-block:: bash

   # data registry db
   data-registry-dev-loadbalancer.jrb-test.development.svc.spin.nersc.org:5432:desc_data_registry:reg_writer:<password>

where ``<password>`` is provided on request by the DESC data registry admins. As
a final step, the ``.pgpass`` file must be readable only by you, which you
can ensure by doing

.. code-block:: bash

   chmod 600 ~/.pgpass
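
The ``.pgpass`` entry uses PostgreSQL's ``hostname:port:database:username:password``
layout, so its first four fields mirror the components of the ``sqlalchemy.url``
value. The sketch below shows that correspondence and a permissions check, using
only the standard library; ``pgpass_entry`` and ``is_private`` are hypothetical
helper names, not part of ``dataregistry``.

```python
import os
import stat
from urllib.parse import urlsplit


def pgpass_entry(sqlalchemy_url: str, password: str) -> str:
    """Format a hostname:port:database:username:password line from a URL."""
    parts = urlsplit(sqlalchemy_url)
    database = parts.path.lstrip("/")  # URL path is "/<database name>"
    return f"{parts.hostname}:{parts.port}:{database}:{parts.username}:{password}"


def is_private(path: str) -> bool:
    """True when group and others have no permissions on the file (as with 0600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


if __name__ == "__main__":
    pgpass = os.path.expanduser("~/.pgpass")
    if os.path.exists(pgpass):
        print(pgpass, "is private:", is_private(pgpass))
    else:
        print(pgpass, "does not exist yet")
```

If ``is_private`` reports ``False``, rerunning the ``chmod`` step above should
fix it; PostgreSQL clients ignore a password file that group or others can access.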
25 changes: 25 additions & 0 deletions _sources/reference_cli.rst.txt
@@ -0,0 +1,25 @@
.. _dregs_cli:

The ``dregs`` CLI
=================

The DESC data registry also comes with a Command Line Interface (CLI) tool,
``dregs``, which can perform some simple actions.

See the :ref:`tutorials section <tutorials-cli>` for a demonstration of its usage.

Registering a new entry in the database
---------------------------------------

.. autoprogram:: cli.cli:arg_register
   :prog: dregs register

Listing datasets within the data registry
-----------------------------------------

The ``dregs ls`` command can be used to quickly list the datasets within the
DESC data registry. Two basic filters can be applied: on the ``owner`` and/or
the ``owner_type``. All entries can also be returned using the ``--all`` flag.

.. autoprogram:: cli.cli:arg_ls
   :prog: dregs ls
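
For scripted use, the listing can be captured from Python. This is a minimal
sketch that relies only on the ``dregs ls`` command and the ``--all`` flag
described above; the wrapper name ``list_all_datasets`` is hypothetical, and the
flag names for the ``owner`` and ``owner_type`` filters are not shown here, so
they are left out.

```python
import shutil
import subprocess


def list_all_datasets():
    """Run `dregs ls --all` and return its output, or None if dregs is not on PATH."""
    if shutil.which("dregs") is None:
        return None
    result = subprocess.run(
        ["dregs", "ls", "--all"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout


if __name__ == "__main__":
    output = list_all_datasets()
    print(output if output is not None else "dregs was not found on PATH")
```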
26 changes: 26 additions & 0 deletions _sources/reference_python.rst.txt
@@ -0,0 +1,26 @@
The ``dataregistry`` package
============================

Reference documentation for the core objects within the ``dataregistry``
package. Demonstrations of their usage can be found in the :ref:`tutorials section <tutorials-python>`.

.. _dataregistry_class:

The DataRegistry class
----------------------

The ``DataRegistry`` class is the primary front end to the ``dataregistry`` package.
This should be the only object users have to import into their code.

It connects the user to the database, and serves as a wrapper to both the
``Registrar`` and ``Query`` classes.

.. autoclass:: dataregistry.DataRegistry
   :members:

.. automethod:: dataregistry.Registrar.register_dataset
.. automethod:: dataregistry.Registrar.get_owner_types
.. automethod:: dataregistry.Registrar.register_execution
.. automethod:: dataregistry.Registrar.register_dataset_alias
.. automethod:: dataregistry.Query.find_datasets
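
The argument lists of these methods are generated from the installed package, so
they are not reproduced here. One way to inspect them locally is sketched below;
``method_signatures`` is a hypothetical helper, and the sketch assumes that
``Registrar`` and ``Query`` are reachable as attributes of the ``dataregistry``
module, as the directive paths above suggest.

```python
import inspect


def method_signatures(cls, names):
    """Map each name to its signature string, or None if cls has no such callable."""
    found = {}
    for name in names:
        member = getattr(cls, name, None)
        found[name] = str(inspect.signature(member)) if callable(member) else None
    return found


if __name__ == "__main__":
    try:
        import dataregistry
    except ImportError:
        print("dataregistry is not installed in this environment")
    else:
        # Assumption: both classes are exposed at the package top level.
        registrar = getattr(dataregistry, "Registrar", None)
        query = getattr(dataregistry, "Query", None)
        print(method_signatures(registrar, ["register_dataset", "register_execution"]))
        print(method_signatures(query, ["find_datasets"]))
```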
