Commit

Merge branch 'main' into mibe-refactor-sklearn-notebooks
ahsimb authored Nov 13, 2023
2 parents 52f6213 + f83662f commit dcb3898
Showing 35 changed files with 527 additions and 507 deletions.
11 changes: 7 additions & 4 deletions .github/workflows/check_ci.yaml
@@ -18,8 +18,11 @@ jobs:
  uses: ./.github/actions/prepare_poetry_env

  - name: Run pytest
- run: poetry run pytest test/ci/test_install_dependencies.py
+ run: >
+ poetry run pytest
+ test/unit
+ test/integration/test_create_dss_docker_image.py
  env: # Set the secret as an env variable
- AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
- AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_ACCESS_KEY_SECRET }}
- AWS_DEFAULT_REGION: ${{ secrets.AWS_REGION }}
+ AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+ AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_ACCESS_KEY_SECRET }}
+ AWS_DEFAULT_REGION: ${{ secrets.AWS_REGION }}
1 change: 1 addition & 0 deletions doc/changes/changes_0.1.0.md
@@ -14,6 +14,7 @@ Version: 0.1.0

  - #11: Created a notebook to show training with scikit-learn in the notebook
  - #15: Installed exasol-notebook-connector via ansible
+ - #30: Added script to build the Data Science Sandbox as Docker Image

  ## Bug Fixes

11 changes: 6 additions & 5 deletions doc/developer_guide/developer_guide.md
@@ -39,11 +39,11 @@ A CLI command has normally a respective function in the `lib` submodule. Hence,

  There are generally three types of commands:

- | Type | Explanation |
- | ----- | --------- |
- | Release Commands | used during the release |
- | Deployment Commands | used to deploy infrastructure onto AWS cloud |
- | Development Commands | used to identify problems or for testing |
+ | Type | Explanation |
+ |----------------------|----------------------------------------------|
+ | Release Commands | used during the release |
+ | Deployment Commands | used to deploy infrastructure onto AWS cloud |
+ | Development Commands | used to identify problems or for testing |

  ### Release commands

@@ -71,6 +71,7 @@ The following commands can be used to deploy the infrastructure onto a given AWS
  - `setup-vm-bucket` - deploys the AWS Bucket cloudformation stack which will be used to deploy the VM images
  - `setup-release-codebuild` - deploys the AWS Codebuild cloudformation stack which will be used for the release-build
  - `setup-vm-bucket-waf` - deploys the AWS Codebuild cloudformation stack which contains the WAF Acl configuration for the Cloudfront distribution of the VM Bucket
+ - `create-docker-image` - creates a Docker image for data-science-sandbox and deploys it to hub.docker.com/exasol/data-science-sandbox

  ## Flow

60 changes: 16 additions & 44 deletions doc/tutorials/transformer/masked_modelling.ipynb
@@ -17,59 +17,31 @@
"1. [Configure the sandbox](../sandbox_config.ipynb).\n",
"2. [Initialize the Transformer Extension](te_init.ipynb).\n",
"\n",
"## Set up"
"## Set up\n",
"\n",
"### Access configuration"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4884f64d-aee2-4248-a922-8d28cf70209f",
"id": "3d9a1e16-acad-4f19-b105-a0e67de4a0e6",
"metadata": {},
"outputs": [],
"source": [
"%run ../access_store_ui.ipynb\n",
"display(get_access_store_ui('../'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "de9d7c52-ee05-45d5-a412-fd47cbf1d404",
"metadata": {},
"outputs": [],
"source": [
"from collections import UserDict\n",
"\n",
"class Secrets(UserDict):\n",
" \"\"\"This class mimics the Secret Store we will start using soon.\"\"\"\n",
"\n",
" def save(self, key: str, value: str) -> \"Secrets\":\n",
" self[key] = value\n",
" return self\n",
"\n",
"def get_value_as_attribute(self, key):\n",
" val = self.get(key)\n",
" if val is None:\n",
" raise AttributeError(f'{key} value is not defined')\n",
" return val\n",
"\n",
"Secrets.__getattr__ = get_value_as_attribute\n",
"\n",
"# For now just hardcode the configuration.\n",
"sb_config = Secrets({\n",
" 'EXTERNAL_HOST_NAME': '192.168.124.93',\n",
" 'HOST_PORT': '8888',\n",
" 'USER': 'sys',\n",
" 'PASSWORD': 'exasol',\n",
" 'BUCKETFS_PORT': '6666',\n",
" 'BUCKETFS_USER': 'w',\n",
" 'BUCKETFS_PASSWORD': 'write',\n",
" 'BUCKETFS_USE_HTTPS': 'False',\n",
" 'BUCKETFS_SERVICE': 'bfsdefault',\n",
" 'BUCKETFS_BUCKET': 'default',\n",
" 'SCRIPT_LANGUAGE_NAME': 'PYTHON3_60',\n",
" 'UDF_FLAVOR': 'python3-ds-EXASOL-6.0.0',\n",
" 'UDF_RELEASE': '20190116',\n",
" 'UDF_CLIENT': 'exaudfclient_py3',\n",
" 'SCHEMA': 'IDA',\n",
" 'TE_TOKEN': '',\n",
" 'TE_TOKEN_CONN': '',\n",
" 'TE_BFS_CONN': 'MyBFSConn',\n",
" 'TE_BFS_DIR': 'my_storage',\n",
" 'TE_MODELS_BFS_DIR': 'models',\n",
" 'TE_MODELS_CACHE_DIR': 'models_cache'\n",
"})\n",
"\n",
"EXTERNAL_HOST = f\"{sb_config.EXTERNAL_HOST_NAME}:{sb_config.HOST_PORT}\"\n",
"\n",
"WEBSOCKET_URL = f\"exa+websocket://{sb_config.USER}:{sb_config.PASSWORD}\" \\\n",
" f\"@{EXTERNAL_HOST}/{sb_config.SCHEMA}?SSLCertificate=SSL_VERIFY_NONE\""
]
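
Note on the hunk above: the removed cell defined a small `Secrets` placeholder, a `UserDict` whose keys can also be read as attributes, and hardcoded the sandbox configuration in it. The sketch below restates that pattern as a self-contained example, using a subset of the values from the removed lines. Defining `__getattr__` inside the class (the notebook assigned it afterwards) and the guard for the internal `data` attribute are choices made for this sketch, not part of the commit. How `access_store_ui.ipynb` supplies `sb_config` after this change is not visible in this diff.

```python
from collections import UserDict


class Secrets(UserDict):
    """Dict-backed placeholder mirroring the class removed in this hunk."""

    def save(self, key: str, value: str) -> "Secrets":
        # Store a value and return self so that calls can be chained.
        self[key] = value
        return self

    def __getattr__(self, key: str) -> str:
        # Invoked only when normal attribute lookup fails.
        # Guard (an addition in this sketch): never resolve the internal
        # `data` dict through the mapping, which would recurse if it is unset.
        if key == "data":
            raise AttributeError(key)
        val = self.get(key)
        if val is None:
            raise AttributeError(f"{key} value is not defined")
        return val


# Sample values copied from the removed hardcoded configuration.
sb_config = Secrets({
    "EXTERNAL_HOST_NAME": "192.168.124.93",
    "HOST_PORT": "8888",
    "USER": "sys",
    "PASSWORD": "exasol",
    "SCHEMA": "IDA",
})

# The two lines kept by the commit read the configuration via attributes.
EXTERNAL_HOST = f"{sb_config.EXTERNAL_HOST_NAME}:{sb_config.HOST_PORT}"
WEBSOCKET_URL = (
    f"exa+websocket://{sb_config.USER}:{sb_config.PASSWORD}"
    f"@{EXTERNAL_HOST}/{sb_config.SCHEMA}?SSLCertificate=SSL_VERIFY_NONE"
)
```

Reading a key that was never saved raises `AttributeError`, so a missing configuration value fails at the point of use instead of producing a URL that contains `None`.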
60 changes: 16 additions & 44 deletions doc/tutorials/transformer/question_answering.ipynb
@@ -17,59 +17,31 @@
"1. [Configure the sandbox](../sandbox_config.ipynb).\n",
"2. [Initialize the Transformer Extension](te_init.ipynb).\n",
"\n",
"## Set up"
"## Set up\n",
"\n",
"### Access configuration"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a28bb232-38d5-445a-9e7a-6f72e80bc2cc",
"id": "264c9323-7f40-40ca-93cf-db4853470206",
"metadata": {},
"outputs": [],
"source": [
"%run ../access_store_ui.ipynb\n",
"display(get_access_store_ui('../'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "38994c13-a7eb-45bb-b0d0-4266830a481d",
"metadata": {},
"outputs": [],
"source": [
"from collections import UserDict\n",
"\n",
"class Secrets(UserDict):\n",
" \"\"\"This class mimics the Secret Store we will start using soon.\"\"\"\n",
"\n",
" def save(self, key: str, value: str) -> \"Secrets\":\n",
" self[key] = value\n",
" return self\n",
"\n",
"def get_value_as_attribute(self, key):\n",
" val = self.get(key)\n",
" if val is None:\n",
" raise AttributeError(f'{key} value is not defined')\n",
" return val\n",
"\n",
"Secrets.__getattr__ = get_value_as_attribute\n",
"\n",
"# For now just hardcode the configuration.\n",
"sb_config = Secrets({\n",
" 'EXTERNAL_HOST_NAME': '192.168.124.93',\n",
" 'HOST_PORT': '8888',\n",
" 'USER': 'sys',\n",
" 'PASSWORD': 'exasol',\n",
" 'BUCKETFS_PORT': '6666',\n",
" 'BUCKETFS_USER': 'w',\n",
" 'BUCKETFS_PASSWORD': 'write',\n",
" 'BUCKETFS_USE_HTTPS': 'False',\n",
" 'BUCKETFS_SERVICE': 'bfsdefault',\n",
" 'BUCKETFS_BUCKET': 'default',\n",
" 'SCRIPT_LANGUAGE_NAME': 'PYTHON3_60',\n",
" 'UDF_FLAVOR': 'python3-ds-EXASOL-6.0.0',\n",
" 'UDF_RELEASE': '20190116',\n",
" 'UDF_CLIENT': 'exaudfclient_py3',\n",
" 'SCHEMA': 'IDA',\n",
" 'TE_TOKEN': '',\n",
" 'TE_TOKEN_CONN': '',\n",
" 'TE_BFS_CONN': 'MyBFSConn',\n",
" 'TE_BFS_DIR': 'my_storage',\n",
" 'TE_MODELS_BFS_DIR': 'models',\n",
" 'TE_MODELS_CACHE_DIR': 'models_cache'\n",
"})\n",
"\n",
"EXTERNAL_HOST = f\"{sb_config.EXTERNAL_HOST_NAME}:{sb_config.HOST_PORT}\"\n",
"\n",
"WEBSOCKET_URL = f\"exa+websocket://{sb_config.USER}:{sb_config.PASSWORD}\" \\\n",
" f\"@{EXTERNAL_HOST}/{sb_config.SCHEMA}?SSLCertificate=SSL_VERIFY_NONE\""
]
60 changes: 16 additions & 44 deletions doc/tutorials/transformer/sequence_classification.ipynb
@@ -17,59 +17,31 @@
"1. [Configure the sandbox](../sandbox_config.ipynb).\n",
"2. [Initialize the Transformer Extension](te_init.ipynb).\n",
"\n",
"## Set up"
"## Set up\n",
"\n",
"### Access configuration"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b4ef3baf-8292-4db0-b86b-88a110d6feb3",
"id": "0ca8f5fb-0fe7-4f07-895d-0857d6c82af2",
"metadata": {},
"outputs": [],
"source": [
"%run ../access_store_ui.ipynb\n",
"display(get_access_store_ui('../'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "78339714-209a-4657-be0d-69dd02ef52b4",
"metadata": {},
"outputs": [],
"source": [
"from collections import UserDict\n",
"\n",
"class Secrets(UserDict):\n",
" \"\"\"This class mimics the Secret Store we will start using soon.\"\"\"\n",
"\n",
" def save(self, key: str, value: str) -> \"Secrets\":\n",
" self[key] = value\n",
" return self\n",
"\n",
"def get_value_as_attribute(self, key):\n",
" val = self.get(key)\n",
" if val is None:\n",
" raise AttributeError(f'{key} value is not defined')\n",
" return val\n",
"\n",
"Secrets.__getattr__ = get_value_as_attribute\n",
"\n",
"# For now just hardcode the configuration.\n",
"sb_config = Secrets({\n",
" 'EXTERNAL_HOST_NAME': '192.168.124.93',\n",
" 'HOST_PORT': '8888',\n",
" 'USER': 'sys',\n",
" 'PASSWORD': 'exasol',\n",
" 'BUCKETFS_PORT': '6666',\n",
" 'BUCKETFS_USER': 'w',\n",
" 'BUCKETFS_PASSWORD': 'write',\n",
" 'BUCKETFS_USE_HTTPS': 'False',\n",
" 'BUCKETFS_SERVICE': 'bfsdefault',\n",
" 'BUCKETFS_BUCKET': 'default',\n",
" 'SCRIPT_LANGUAGE_NAME': 'PYTHON3_60',\n",
" 'UDF_FLAVOR': 'python3-ds-EXASOL-6.0.0',\n",
" 'UDF_RELEASE': '20190116',\n",
" 'UDF_CLIENT': 'exaudfclient_py3',\n",
" 'SCHEMA': 'IDA',\n",
" 'TE_TOKEN': '',\n",
" 'TE_TOKEN_CONN': '',\n",
" 'TE_BFS_CONN': 'MyBFSConn',\n",
" 'TE_BFS_DIR': 'my_storage',\n",
" 'TE_MODELS_BFS_DIR': 'models',\n",
" 'TE_MODELS_CACHE_DIR': 'models_cache'\n",
"})\n",
"\n",
"EXTERNAL_HOST = f\"{sb_config.EXTERNAL_HOST_NAME}:{sb_config.HOST_PORT}\"\n",
"\n",
"WEBSOCKET_URL = f\"exa+websocket://{sb_config.USER}:{sb_config.PASSWORD}\" \\\n",
" f\"@{EXTERNAL_HOST}/{sb_config.SCHEMA}?SSLCertificate=SSL_VERIFY_NONE\""
]
61 changes: 18 additions & 43 deletions doc/tutorials/transformer/te_init.ipynb
@@ -16,54 +16,29 @@
"Prior to using this notebook one needs to complete the following steps:\n",
"1. [Configure the sandbox](../sandbox_config.ipynb).\n",
"\n",
"## Set up"
"## Set up\n",
"\n",
"### Access configuration"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7433e5e3-5258-4773-b202-7aa1b05303ef",
"id": "a8068d04-ef01-443a-abb5-2b0987521b1a",
"metadata": {},
"outputs": [],
"source": [
"%run ../access_store_ui.ipynb\n",
"display(get_access_store_ui('../'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "afbaf706-cc3b-498d-8e7a-3e866ac439f1",
"metadata": {},
"outputs": [],
"source": [
"#TODO: start using the secret store.\n",
"\n",
"from collections import UserDict\n",
"\n",
"class Secrets(UserDict):\n",
" \"\"\"This class mimics the Secret Store we will start using soon.\"\"\"\n",
"\n",
" def save(self, key: str, value: str) -> \"Secrets\":\n",
" self[key] = value\n",
" return self\n",
"\n",
"def get_value_as_attribute(self, key):\n",
" val = self.get(key)\n",
" if val is None:\n",
" raise AttributeError(f'{key} value is not defined')\n",
" return val\n",
"\n",
"Secrets.__getattr__ = get_value_as_attribute\n",
"\n",
"# For now just hardcode the configuration.\n",
"sb_config = Secrets({ \n",
" 'EXTERNAL_HOST_NAME': '192.168.124.93',\n",
" 'HOST_PORT': '8888',\n",
" 'USER': 'sys',\n",
" 'PASSWORD': 'exasol',\n",
" 'BUCKETFS_PORT': '6666',\n",
" 'BUCKETFS_USER': 'w',\n",
" 'BUCKETFS_PASSWORD': 'write',\n",
" 'BUCKETFS_USE_HTTPS': 'False',\n",
" 'BUCKETFS_SERVICE': 'bfsdefault',\n",
" 'BUCKETFS_BUCKET': 'default',\n",
" 'SCRIPT_LANGUAGE_NAME': 'PYTHON3_60',\n",
" 'UDF_FLAVOR': 'python3-ds-EXASOL-6.0.0',\n",
" 'UDF_RELEASE': '20190116',\n",
" 'UDF_CLIENT': 'exaudfclient_py3',\n",
" 'SCHEMA': 'IDA'\n",
"})\n",
"\n",
"EXTERNAL_HOST = f\"{sb_config.EXTERNAL_HOST_NAME}:{sb_config.HOST_PORT}\""
]
},
@@ -78,14 +53,14 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "6bcaef63-06a3-40c1-aaa2-552d78c77125",
+ "id": "013f5167-f1e3-470c-b0c7-00580b9cd98e",
  "metadata": {},
  "outputs": [],
  "source": [
  "%%capture\n",
  "\n",
  "# Huggingface token required for downloading private models.\n",
- "sb_config.save('TE_TOKEN', '')\n",
+ "sb_config.save('TE_TOKEN', '-')\n",
  "\n",
  "# Name of the connection encapsulating the Huggingface token. Leave it empty if the token is not used.\n",
  "sb_config.save('TE_TOKEN_CONN', '')\n",
@@ -99,7 +74,7 @@
  "# We will store all models in this sub-directory at BucketFS.\n",
  "sb_config.save('TE_MODELS_BFS_DIR', 'models')\n",
  " \n",
- "# We will save cached model in this sub-directory relative to the current directory on the local machine.\n",
+ "# We will save a cached model in this sub-directory relative to the current directory on the local machine.\n",
  "sb_config.save('TE_MODELS_CACHE_DIR', 'models_cache')"
  ]
  },