add optional-components/stac-data-proxy (#403)
## Overview

Provide a way to host local data that STAC API can refer to for use/download.

Currently, any STAC Asset referenced in STAC-API Collection/Item responses must either already be hosted by another service of the stack (e.g., CMIP6 netCDF in THREDDS) or point at an external resource not on the server.

Instead of requiring a custom configuration and mount point on each node, this optional component provides a standard way to define them.

## Changes

**Non-breaking changes**

- `optional-components/stac-data-proxy`: add a new feature to allow hosting of local STAC assets.

  The new component defines variables `STAC_DATA_PROXY_DIR_PATH` (default `${DATA_PERSIST_ROOT}/stac-data`) and
  `STAC_DATA_PROXY_URL_PATH` (default `/data/stac`) that are aliased (mapped) under `nginx` to provide a URL
  where locally hosted STAC assets can be downloaded from. This allows a server node to be a proper data provider,
  where its STAC-API can return Catalog, Collection and Item definitions that point at these local assets available
  through the `STAC_DATA_PROXY_URL_PATH` endpoint.

  When enabled, this component can be combined with `optional-components/secure-data-proxy` to allow per-resource
  access control of the contents under `STAC_DATA_PROXY_DIR_PATH` by setting relevant Magpie permissions under service
  `secure-data-proxy` for children resources that correspond to `STAC_DATA_PROXY_URL_PATH`. Otherwise, the path and
  all of its contents are publicly available, in the same fashion that WPS outputs are managed without
  `optional-components/secure-data-proxy`. 

  More details are provided in https://github.com/bird-house/birdhouse-deploy/blob/stac-data-proxy/birdhouse/optional-components/README.rst#provide-a-proxy-for-local-stac-asset-hosting
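
The component README added in this change lists how default access to the proxied data depends on which optional components are enabled. A minimal sketch of that decision table follows; the function name and the returned strings are hypothetical labels chosen for illustration, not identifiers from the repository.

```python
def default_asset_access(stac_data_proxy: bool,
                         stac_public_access: bool,
                         secure_data_proxy: bool) -> str:
    """Summarize the default access to data under STAC_DATA_PROXY_URL_PATH.

    Hypothetical helper: it only mirrors the combinations table of the README.
    """
    if not stac_data_proxy:
        # Without the component, the URL path is not served at all.
        return "not served"
    if not secure_data_proxy:
        # No auth request in nginx: everything under the path is public,
        # whether or not stac-public-access is enabled.
        return "public, no per-resource control"
    if stac_public_access:
        # Magpie manages access, with public read granted by default.
        return "public by default, per-resource control (deny to revoke)"
    # Magpie manages access and nothing is granted beyond administrators.
    return "admin-only by default, per-resource control"


print(default_asset_access(True, False, True))
```

Note that without `secure-data-proxy`, enabling `stac-public-access` changes only whether the STAC-API metadata is public; the hosted data is public in both cases.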

**Breaking changes**
- n/a

## Related Issue / Discussion

- Relates to crim-ca/stac-populator#31
- Relates to contents in https://github.com/ai-extensions/stac-data-loader/tree/main/data/EuroSAT/stac
- Relates to https://github.com/ai-extensions/stac-data-loader/blob/main/notebooks/stac_eurosat.ipynb

STAC metadata generated from the above notebook (see subset for example) will be able to use a location such as `https://${PAVICS_FQDN_PUBLIC}${STAC_DATA_PROXY_URL_PATH}/EuroSAT/...` instead of the temporary raw-GitHub content URLs. The STAC populator (with the `DirectoryLoading` implementation) will be able to push the STAC Collection/Items to that instance. The STAC Assets that they refer to will be placed under `${STAC_DATA_PROXY_DIR_PATH}/EuroSAT` to make them accessible externally.
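
As an illustration of the URL pattern described above, the following sketch derives the public `href` of an asset from its location on disk. The helper name `asset_href`, the host `example.org`, the file `sample.tif` and the directory `/data/stac-data` (which assumes `DATA_PERSIST_ROOT=/data`) are all assumptions made for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def asset_href(local_path: str, dir_path: str, fqdn: str,
               url_path: str = "/data/stac") -> str:
    """Return the public URL of a file stored under the asset directory.

    dir_path plays the role of STAC_DATA_PROXY_DIR_PATH, fqdn of
    PAVICS_FQDN_PUBLIC and url_path of STAC_DATA_PROXY_URL_PATH.
    """
    # relative_to() raises ValueError when the file is outside dir_path,
    # which would mean the proxy cannot serve it.
    relative = PurePosixPath(local_path).relative_to(dir_path)
    # quote() percent-encodes unsafe characters but keeps '/' separators.
    return f"https://{fqdn}{url_path.rstrip('/')}/{quote(relative.as_posix())}"


print(asset_href("/data/stac-data/EuroSAT/sample.tif",
                 dir_path="/data/stac-data",
                 fqdn="example.org"))
```

A STAC Item could then use the returned string as the `href` of the corresponding asset.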
fmigneault authored Nov 30, 2023
2 parents 7120b98 + ff9e7ca commit e408cea
Showing 17 changed files with 173 additions and 15 deletions.
6 changes: 3 additions & 3 deletions .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
-current_version = 1.39.2
+current_version = 1.40.0
commit = True
tag = False
tag_name = {new_version}
@@ -30,11 +30,11 @@ search = {current_version}
replace = {new_version}

[bumpversion:file:RELEASE.txt]
-search = {current_version} 2023-11-30T15:28:22Z
+search = {current_version} 2023-11-30T18:27:41Z
replace = {new_version} {utcnow:%Y-%m-%dT%H:%M:%SZ}

[bumpversion:part:releaseTime]
-values = 2023-11-30T15:28:22Z
+values = 2023-11-30T18:27:41Z

[bumpversion:file(version):birdhouse/config/canarie-api/docker_configuration.py.template]
search = 'version': '{current_version}'
20 changes: 20 additions & 0 deletions CHANGES.md
@@ -15,6 +15,26 @@
[Unreleased](https://github.com/bird-house/birdhouse-deploy/tree/master) (latest)
------------------------------------------------------------------------------------------------------------------

[//]: # (list changes here, using '-' for each new entry, remove this when items are added)

[1.40.0](https://github.com/bird-house/birdhouse-deploy/tree/1.40.0) (2023-11-30)
------------------------------------------------------------------------------------------------------------------

- `optional-components/stac-data-proxy`: add a new feature to allow hosting of local STAC assets.

The new component defines variables `STAC_DATA_PROXY_DIR_PATH` (default `${DATA_PERSIST_ROOT}/stac-data`) and
`STAC_DATA_PROXY_URL_PATH` (default `/data/stac`) that are aliased (mapped) under `nginx` to provide a URL
where locally hosted STAC assets can be downloaded from. This allows a server node to be a proper data provider,
where its STAC-API can return Catalog, Collection and Item definitions that point at these local assets available
through the `STAC_DATA_PROXY_URL_PATH` endpoint.

When enabled, this component can be combined with `optional-components/secure-data-proxy` to allow per-resource
access control of the contents under `STAC_DATA_PROXY_DIR_PATH` by setting relevant Magpie permissions under service
`secure-data-proxy` for children resources that correspond to `STAC_DATA_PROXY_URL_PATH`. Otherwise, the path and
all of its contents are publicly available, in the same fashion that WPS outputs are managed without
`optional-components/secure-data-proxy`. More details are provided under the component's
[README](./birdhouse/optional-components/README.rst#provide-a-proxy-for-local-stac-asset-hosting).

- `optional-components/stac-public-access`: add public write permission for `POST /stac/search` request.

Since [`pystac_client`](https://github.com/stac-utils/pystac-client), a common interface to interact with STAC API,
2 changes: 1 addition & 1 deletion Makefile
@@ -1,7 +1,7 @@
# Generic variables
override SHELL := bash
override APP_NAME := birdhouse-deploy
-override APP_VERSION := 1.39.2
+override APP_VERSION := 1.40.0

# utility to remove comments after value of an option variable
override clean_opt = $(shell echo "$(1)" | $(_SED) -r -e "s/[ '$'\t'']+$$//g")
8 changes: 4 additions & 4 deletions README.rst
@@ -14,13 +14,13 @@ for a full-fledged production platform.
* - releases
- | |latest-version| |commits-since|

-.. |commits-since| image:: https://img.shields.io/github/commits-since/bird-house/birdhouse-deploy/1.39.2.svg
+.. |commits-since| image:: https://img.shields.io/github/commits-since/bird-house/birdhouse-deploy/1.40.0.svg
:alt: Commits since latest release
-:target: https://github.com/bird-house/birdhouse-deploy/compare/1.39.2...master
+:target: https://github.com/bird-house/birdhouse-deploy/compare/1.40.0...master

-.. |latest-version| image:: https://img.shields.io/badge/tag-1.39.2-blue.svg?style=flat
+.. |latest-version| image:: https://img.shields.io/badge/tag-1.40.0-blue.svg?style=flat
:alt: Latest Tag
-:target: https://github.com/bird-house/birdhouse-deploy/tree/1.39.2
+:target: https://github.com/bird-house/birdhouse-deploy/tree/1.40.0

.. |readthedocs| image:: https://readthedocs.org/projects/birdhouse-deploy/badge/?version=latest
:alt: ReadTheDocs Build Status (latest version)
2 changes: 1 addition & 1 deletion RELEASE.txt
@@ -1 +1 @@
-1.39.2 2023-11-30T15:28:22Z
+1.40.0 2023-11-30T18:27:41Z
8 changes: 4 additions & 4 deletions birdhouse/config/canarie-api/docker_configuration.py.template
@@ -109,8 +109,8 @@ SERVICES = {
# NOTE:
# Below version and release time auto-managed by 'make VERSION=x.y.z bump'.
# Do NOT modify it manually. See 'Tagging policy' in 'birdhouse/README.rst'.
-'version': '1.39.2',
-'releaseTime': '2023-11-30T15:28:22Z',
+'version': '1.40.0',
+'releaseTime': '2023-11-30T18:27:41Z',
'institution': 'Ouranos',
'researchSubject': 'Climatology',
'supportEmail': '${SUPPORT_EMAIL}',
@@ -142,8 +142,8 @@ PLATFORMS = {
# NOTE:
# Below version and release time auto-managed by 'make VERSION=x.y.z bump'.
# Do NOT modify it manually. See 'Tagging policy' in 'birdhouse/README.rst'.
-'version': '1.39.2',
-'releaseTime': '2023-11-30T15:28:22Z',
+'version': '1.40.0',
+'releaseTime': '2023-11-30T18:27:41Z',
'institution': 'Ouranos',
'researchSubject': 'Climatology',
'supportEmail': '${SUPPORT_EMAIL}',
49 changes: 49 additions & 0 deletions birdhouse/optional-components/README.rst
@@ -326,6 +326,55 @@ To enable this optional-component:
- Add ``./optional-components/stac-public-access`` to ``EXTRA_CONF_DIRS``.


Provide a proxy for local STAC asset hosting
--------------------------------------------------------

The STAC data proxy serves the URL location defined by ``PAVICS_FQDN_PUBLIC`` and ``STAC_DATA_PROXY_URL_PATH``
to provide access to files contained within ``STAC_DATA_PROXY_DIR_PATH``.

The ``STAC_DATA_PROXY_DIR_PATH`` location can be used to hold STAC Assets defined by the current server node
(in contrast to STAC definitions that would refer to remote locations), such that the node can be the original
location of new data, or to make a new local replication of remote data.

To enable this optional-component:

- Edit ``env.local`` (a copy of `env.local.example`_)
- Add ``./optional-components/stac-data-proxy`` to ``EXTRA_CONF_DIRS``.
- Optionally, add any other relevant components to control access as desired (see below).

When using this component, access to the endpoint defined by ``STAC_DATA_PROXY_URL_PATH``, and therefore all
corresponding files contained under mapped ``STAC_DATA_PROXY_DIR_PATH`` will depend on how this
feature is combined with ``./optional-components/stac-public-access`` and ``./optional-components/secure-data-proxy``.
Following are the possible combinations and obtained behaviors:

.. list-table::
:header-rows: 1

* - Enabled Components
- Obtained Behaviors

* - Only ``./optional-components/stac-data-proxy`` is enabled.
- All data under ``STAC_DATA_PROXY_URL_PATH`` is publicly accessible without authorization control
and specific resource access cannot be managed per content. However, since STAC-API itself is not made public,
the STAC Catalog, Collections and Items cannot be accessed publicly
(*note*: this is most probably never desired).

* - Both ``./optional-components/stac-data-proxy`` and ``./optional-components/stac-public-access`` are enabled.
- All data under ``STAC_DATA_PROXY_URL_PATH`` is publicly accessible without possibility to manage per-resource
access. However, this public access is aligned with publicly accessible STAC-API endpoints and contents.

* - Both ``./optional-components/stac-data-proxy`` and ``./optional-components/secure-data-proxy`` are enabled.
- All data under ``STAC_DATA_PROXY_URL_PATH`` is protected (by default, admin-only), but can be granted access
on a per-user, per-group and per-resource basis according to permissions applied by the administrator.
Since STAC-API is not made public by default, the administrator can decide whether they grant access only to
STAC metadata (Catalog, Collection, Items) with permission applied on the ``stac`` Magpie service, only to
assets data with permission under the ``stac-data-proxy``, or both.

* - All of ``./optional-components/stac-data-proxy``, ``./optional-components/stac-public-access`` and
``./optional-components/secure-data-proxy`` are enabled.
- Similar to the previous case, allowing full authorization management control by the administrator, but contents
are publicly accessible by default. To revoke access, a Magpie administrator has to apply a ``deny`` permission.

X-Robots-Tag Header
---------------------------

2 changes: 2 additions & 0 deletions birdhouse/optional-components/stac-data-proxy/.gitignore
@@ -0,0 +1,2 @@
config/proxy/conf.extra-service.d/stac-proxy-data.conf
config/secure-data-proxy/permissions.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
location ${STAC_DATA_PROXY_URL_PATH}/ {
${SECURE_DATA_PROXY_AUTH_INCLUDE}

alias /stac-data-proxy/;
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
version: "3.4"
services:
proxy:
volumes:
- ./optional-components/stac-data-proxy/config/proxy/conf.extra-service.d:/etc/nginx/conf.extra-service.d/stac-data-proxy:ro
# NOTE: data for hosted STAC assets, not to be confused with 'stac-db' for internal STAC catalog definitions
- ${STAC_DATA_PROXY_DIR_PATH}:/stac-data-proxy
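
The `location`/`alias` pair and the volume mount above together decide which file a request resolves to: nginx replaces the matched location prefix with the alias, and the container path `/stac-data-proxy` is backed by `STAC_DATA_PROXY_DIR_PATH` on the host. A rough sketch of that two-step mapping, assuming the default `/data/stac` URL path and a host directory of `/data/stac-data` (both function names are hypothetical):

```python
from typing import Optional


def resolve_alias(request_path: str,
                  location_prefix: str = "/data/stac/",
                  alias: str = "/stac-data-proxy/") -> Optional[str]:
    """Mimic nginx 'alias': swap the location prefix for the alias path."""
    if not request_path.startswith(location_prefix):
        return None  # handled by some other location block
    return alias + request_path[len(location_prefix):]


def container_to_host(container_path: str,
                      mount_point: str = "/stac-data-proxy",
                      host_dir: str = "/data/stac-data") -> str:
    """Translate a path inside the proxy container to the host directory."""
    # The volume exposes host_dir at mount_point inside the container.
    return host_dir + container_path[len(mount_point):]


in_container = resolve_alias("/data/stac/EuroSAT/a.tif")
print(in_container)
print(container_to_host(in_container))
```

Because the `location` ends with a trailing slash, only paths below `${STAC_DATA_PROXY_URL_PATH}/` match this block in the sketch.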
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
version: "3.4"
services:
magpie:
volumes:
- ./optional-components/stac-data-proxy/config/secure-data-proxy/permissions.cfg:${MAGPIE_PERMISSIONS_CONFIG_PATH}/stac-data-proxy.cfg:ro
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
# NOTE:
# Assume 'secure-data-proxy' would exist if needed (other component dependency).
# Since sorted load order of 'secure-data-proxy' < 'stac-data-proxy' in Magpie, 'secure-data-proxy' should exist.
permissions:
# following permission does not change anything technically (full access for admins)
# it is employed only to set up the relevant resource path and make permission customization easier by Magpie API/UI
- service: secure-data-proxy
resource: ${STAC_DATA_PROXY_URL_PATH}
type: route
permission: read
group: administrators
action: create
42 changes: 42 additions & 0 deletions birdhouse/optional-components/stac-data-proxy/default.env
@@ -0,0 +1,42 @@
#!/bin/sh

# All env in this default.env can be overridden by env.local.

# All env in this default.env must NOT depend on any other env. If they do, they
# must use single quotes to avoid early expansion before overrides in env.local
# are applied and must be added to the list of DELAYED_EVAL.

# add any new variables not already in 'VARS' or 'OPTIONAL_VARS' that must be replaced in templates here
# single quotes are important in below list to keep variable names intact until 'pavics-compose' parses them
EXTRA_VARS='
$STAC_DATA_PROXY_DIR_PATH
$STAC_DATA_PROXY_URL_PATH
'

# extend the original 'VARS' from 'birdhouse/pavics-compose.sh' to employ them for template substitution
# adding them to 'VARS', they will also be validated in case of override of 'default.env' using 'env.local'
VARS="$VARS $EXTRA_VARS"

# Directory path that will be used as volume mount for storing hosted STAC assets data
# NOTE:
# Hosting is not performed by the API itself. Data is expected to already reside in that
# location when referenced by STAC Collections and Items to make them accessible externally.
export STAC_DATA_PROXY_DIR_PATH='${DATA_PERSIST_ROOT}/stac-data'

# URL path (after PAVICS_FQDN_PUBLIC) that will be used to proxy local STAC assets data
export STAC_DATA_PROXY_URL_PATH="/data/stac"

DELAYED_EVAL="
$DELAYED_EVAL
STAC_DATA_PROXY_DIR_PATH
"

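The single quotes around `${DATA_PERSIST_ROOT}/stac-data` and the `DELAYED_EVAL` entry work together: the value stays a literal template until overrides from `env.local` are applied, and only then is it expanded. The following sketch imitates that ordering with `string.Template`; it is not how `pavics-compose` implements it, and the function name and the `/data` and `/mnt/storage` values are assumptions for illustration.

```python
from string import Template


def resolve_delayed(env: dict, delayed: list) -> dict:
    """Expand ${VAR} references of the delayed variables, after overrides."""
    resolved = dict(env)
    for name in delayed:
        # substitute() raises KeyError if a referenced variable is missing.
        resolved[name] = Template(resolved[name]).substitute(resolved)
    return resolved


# Values as they would look right after loading default.env (unexpanded).
defaults = {
    "DATA_PERSIST_ROOT": "/data",
    "STAC_DATA_PROXY_DIR_PATH": "${DATA_PERSIST_ROOT}/stac-data",
}
# Hypothetical override coming from env.local.
overrides = {"DATA_PERSIST_ROOT": "/mnt/storage"}

env = resolve_delayed({**defaults, **overrides}, ["STAC_DATA_PROXY_DIR_PATH"])
print(env["STAC_DATA_PROXY_DIR_PATH"])
```

Had the default been expanded eagerly (double quotes), the directory would have kept pointing under the original `DATA_PERSIST_ROOT` even after the override.
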
# add any component that this component requires to run
# NOTE:
# './optional-components/secure-data-proxy' is purposely omitted from dependencies
# if 'EXTRA_CONF_DIRS' enabled it as well, the proxy path/alias will have relevant auth request enabled
# otherwise, it will use by default the public access with no prior nginx auth validation
COMPONENT_DEPENDENCIES="
./components/stac
./config/proxy
"
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
config/stac-data-proxy/permissions.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
version: "3.4"
services:
magpie:
volumes:
- ./optional-components/stac-public-access/config/stac-data-proxy/permissions.cfg:${MAGPIE_PERMISSIONS_CONFIG_PATH}/stac-data-proxy-public.cfg:ro
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# NOTE:
# Assume 'secure-data-proxy' would exist if needed.
# Since 'secure-data-proxy' < 'stac-data-proxy-public', it should be loaded first.
permissions:
- service: secure-data-proxy
resource: ${STAC_DATA_PROXY_URL_PATH}
type: route
permission: read
group: anonymous
action: create
4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -69,9 +69,9 @@
# built documents.
#
# The short X.Y version.
-version = '1.39.2'
+version = '1.40.0'
# The full version, including alpha/beta/rc tags.
-release = '1.39.2'
+release = '1.40.0'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.