update main #48

Merged
merged 23 commits into from
Nov 20, 2023
31fc921
add build-service script
danielfromearth Oct 31, 2023
c610db3
add .dockerignore
danielfromearth Oct 31, 2023
90a70b1
use updated poetry install argument
danielfromearth Oct 31, 2023
261f0e3
perform additional apt-get cleanup
danielfromearth Oct 31, 2023
6e812a1
remove extra user switching from Dockerfile
danielfromearth Oct 31, 2023
d9a8ae9
exclude all tests from docker build
danielfromearth Nov 1, 2023
3325b97
remove comment
danielfromearth Nov 1, 2023
4c32719
Bump ruff from 0.1.3 to 0.1.4
dependabot[bot] Nov 6, 2023
cafbdfd
Merge pull request #40 from danielfromearth/dependabot/pip/develop/ru…
danielfromearth Nov 6, 2023
6095e05
update poetry.lock
danielfromearth Nov 8, 2023
ca6f485
change output of process_catalogs from single Catalog to list of Cata…
danielfromearth Nov 8, 2023
034dc83
Bump mypy from 1.6.1 to 1.7.0
dependabot[bot] Nov 13, 2023
f503b5a
Merge pull request #46 from danielfromearth/dependabot/pip/develop/my…
danielfromearth Nov 13, 2023
d9574e2
Bump ruff from 0.1.4 to 0.1.5
dependabot[bot] Nov 13, 2023
cad452c
Merge pull request #45 from danielfromearth/dependabot/pip/develop/ru…
danielfromearth Nov 13, 2023
eedbd73
Bump black from 23.10.1 to 23.11.0
dependabot[bot] Nov 13, 2023
e31794e
Merge pull request #44 from danielfromearth/dependabot/pip/develop/bl…
danielfromearth Nov 13, 2023
f8eeb91
Bump harmony-service-lib from 1.0.23 to 1.0.24
dependabot[bot] Nov 13, 2023
f796819
Merge pull request #43 from danielfromearth/dependabot/pip/develop/ha…
danielfromearth Nov 13, 2023
e80fbc4
Bump ruff from 0.1.5 to 0.1.6
dependabot[bot] Nov 20, 2023
7a7acfb
Merge pull request #47 from danielfromearth/dependabot/pip/develop/ru…
danielfromearth Nov 20, 2023
eefef42
update CHANGELOG.md
danielfromearth Nov 20, 2023
82292be
Merge pull request #42 from danielfromearth/feature/issue-41
danielfromearth Nov 20, 2023
102 changes: 102 additions & 0 deletions .dockerignore
@@ -0,0 +1,102 @@
TEMPO_*.nc
tests/
.git
.gitignore
.travis.yaml
.swagger-codegen-ignore
tox.ini

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/
venv/
.python-version

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

#Ipython Notebook
.ipynb_checkpoints

# Intellij project settings
.idea

# VSCode project settings
.vscode

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# ruff
.ruff_cache/
# pytest
.pytest_cache/

# Pyre type checker
.pyre/

# Github integration settings
.github

# .dockerignore and builder script themselves are not needed by
# docker build.
.dockerignore
build-service
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Changed
- [issue/11](https://github.com/danielfromearth/batchee/issues/11): Rename from concat_batcher to batchee
- [issue/21](https://github.com/danielfromearth/batchee/issues/21): Improve CICD workflows
- [issue/41](https://github.com/danielfromearth/batchee/issues/41): Change Adapter output from single to multiple STAC Catalogs
### Deprecated
### Removed
### Fixed
10 changes: 2 additions & 8 deletions Dockerfile
@@ -10,7 +10,7 @@ RUN apt-get update \
&& pip3 install --upgrade pip \
&& pip3 install cython \
&& pip3 install poetry \
&& apt-get clean
&& apt-get clean && rm -rf /var/lib/apt/lists/*


# Create a new user
@@ -30,22 +30,16 @@ ARG DIST_PATH
USER root
RUN mkdir -p /worker && chown dockeruser /worker
COPY pyproject.toml /worker
# COPY ../pyproject.toml /worker
USER dockeruser

WORKDIR /worker

# ENV PYTHONPATH=${PYTHONPATH}:${PWD}

COPY --chown=dockeruser $DIST_PATH $DIST_PATH
USER dockeruser
#RUN pip3 install --no-cache-dir --force --user --index-url https://pypi.org/simple/ --extra-index-url https://test.pypi.org/simple/ $SOURCE \
# && rm -rf $DIST_PATH

#install poetry as root
USER root
RUN poetry config virtualenvs.create false
RUN poetry install --no-dev
RUN poetry install --only main

USER dockeruser
COPY --chown=dockeruser ./docker-entrypoint.sh docker-entrypoint.sh
54 changes: 29 additions & 25 deletions batcher/harmony/service_adapter.py
@@ -7,8 +7,8 @@
from pystac.item import Asset

from batcher.harmony.util import (
_get_item_url,
_get_netcdf_urls,
_get_output_bounding_box,
_get_output_date_range,
)
from batcher.tempo_filename_parser import get_batch_indices
@@ -48,21 +48,20 @@ def invoke(self):

return self.message, self.process_catalog(self.catalog)

def process_catalog(self, catalog: pystac.Catalog):
def process_catalog(self, catalog: pystac.Catalog) -> list[pystac.Catalog]:
"""Converts a single STAC Catalog into a list of STAC Catalogs, one per batch."""
self.logger.info("process_catalog() started.")
try:
result = catalog.clone()
result.id = str(uuid4())
result.clear_children()

# Get all the items from the catalog, including from child or linked catalogs
items = list(self.get_all_catalog_items(catalog))

self.logger.info(f"length of items==={len(items)}.")

# Quick return if catalog contains no items
if len(items) == 0:
result = catalog.clone()
result.id = str(uuid4())
result.clear_children()
return [result]

# # --- Get granule filepaths (urls) ---
@@ -79,38 +78,43 @@ def process_catalog(self, catalog: pystac.Catalog):
for k, v in zip(batch_indices, items):
grouped.setdefault(k, []).append(v)

# --- Construct a STAC Catalog that holds multiple Items (which represent each TEMPO scan),
# and each Item holds multiple Assets (which represent each granule).
result.clear_items()

# --- Construct a list of STAC Catalogs (which represent each TEMPO scan),
# and each Catalog holds multiple Items (which represent each granule).
catalogs = []
for batch_id, batch_items in grouped.items():
batch_urls: list[str] = _get_netcdf_urls(batch_items)
bounding_box = _get_output_bounding_box(batch_items)
properties = _get_output_date_range(batch_items)

self.logger.info(f"constructing new pystac.Item for batch_id==={batch_id}.")

# Construct a new pystac.Item with every granule in the batch as a pystac.Asset
output_item = Item(
str(uuid4()), bbox_to_geometry(bounding_box), bounding_box, None, properties
)
self.logger.info(f"constructing new pystac.Catalog for batch_id==={batch_id}.")
# Initialize a new, empty Catalog
batch_catalog = catalog.clone()
batch_catalog.id = str(uuid4())
batch_catalog.clear_children()
batch_catalog.clear_items()

for idx, item in enumerate(batch_items):
# Construct a new pystac.Item for each granule in the batch
output_item = Item(
str(uuid4()),
bbox_to_geometry(item.bbox),
item.bbox,
None,
_get_output_date_range([item]),
)
output_item.add_asset(
f"data_{idx}",
Asset(
batch_urls[idx],
title=batch_urls[idx],
_get_item_url(item),
title=_get_item_url(item),
media_type="application/x-netcdf4",
roles=["data"],
),
)
batch_catalog.add_item(output_item)

result.add_item(output_item)
self.logger.info(f"STAC catalog creation for batch_id==={batch_id} complete.")
catalogs.append(batch_catalog)

self.logger.info("STAC catalog creation complete.")
self.logger.info("All STAC catalogs are complete.")

return result
return catalogs

except Exception as service_exception:
self.logger.error(service_exception, exc_info=1)
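
The regrouping introduced by this change can be illustrated without pystac. The sketch below is a minimal, hypothetical stand-in: plain dicts take the place of `pystac.Catalog` and `pystac.Item`, and `group_into_batches` is an illustrative name, not a function from batchee. It mirrors the `setdefault` grouping shown in the diff and returns one container per batch instead of a single merged one.

```python
from uuid import uuid4


def group_into_batches(batch_indices, items):
    """Group items by batch index and return one catalog-like dict per batch.

    Hypothetical stand-in for the adapter logic: dicts are used in place of
    pystac.Catalog and pystac.Item so the sketch runs without dependencies.
    """
    grouped = {}
    # Same pattern as in the diff: items sharing a batch index end up together.
    for k, v in zip(batch_indices, items):
        grouped.setdefault(k, []).append(v)

    catalogs = []
    for batch_id, batch_items in grouped.items():
        # One new container per batch, each with a fresh identifier.
        catalogs.append(
            {"id": str(uuid4()), "batch_id": batch_id, "items": list(batch_items)}
        )
    return catalogs


if __name__ == "__main__":
    urls = ["scan1_g1.nc", "scan1_g2.nc", "scan2_g1.nc"]
    for catalog in group_into_batches([0, 0, 1], urls):
        print(catalog["batch_id"], catalog["items"])
```

Because the function always returns a list, a caller can iterate over the result the same way whether the input yields zero, one, or many batches.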
3 changes: 3 additions & 0 deletions build-service
@@ -0,0 +1,3 @@
#!/bin/bash

docker build -t "asdc-trade/batchee:${VERSION-latest}" .
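
The `${VERSION-latest}` expansion in the script falls back to `latest` only when `VERSION` is unset; an empty `VERSION` is used as-is. As a rough illustration, assuming a hypothetical helper named `image_tag` that is not part of this repository, the same rule can be expressed in Python, where `get` with a default behaves the same way:

```python
import os


def image_tag(env=None):
    """Return the image tag the build script would resolve (hypothetical helper)."""
    env = os.environ if env is None else env
    # Like ${VERSION-latest}: the default applies only when VERSION is unset,
    # not when it is set to an empty string.
    version = env.get("VERSION", "latest")
    return f"asdc-trade/batchee:{version}"


if __name__ == "__main__":
    print(image_tag({}))                    # VERSION unset
    print(image_tag({"VERSION": "1.2.0"}))  # VERSION set
```

If an empty `VERSION` should also fall back to `latest`, the shell form `${VERSION:-latest}` covers that case.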