[Feature] LLM Stack (#191)
* [RPD-282] adds stack cli command (#179)

* adds tests and logic for stack and stack set cli commands

* fixes typos

* updated for comments

* addresses comments

* [RPD-285] Create LLM stack Terraform files (#182)

* Add initial LLM Terraform files

* Update Terraform Kubernetes Chroma deployment

* Update Terraform docs

* Revert default variables

* Add missing aks module

* [RPD-283] add core function for updating stack type (#185)

* [RPD-287] ZenML version inference for zenserver (#180)

* various changes

* change variable names to make things more clear

* shubham's comment

* test for latest

* update existing tests

* update existing tests

* shubham latest comment

* add gitflow notation branches to ci (#183)

* remove old, commented out code

* [RPD-260] Add an object to handle the `matcha.config.json` file. (#184)

* adds logic and tests for matcha config module

* updates docstrings

* adds tests and implements config object throughout matcha

* updated for pr comments

* updates docstring

* fixes ci

* updates for comments

* enum and metaclass

* enum and metaclass

* bug removal

* docstring stack_set

* docstring file exists

* fix current tests

* add a couple tests, move logic for updating to configservice

* lowercase-ify arg to enum

* american spelling

* american spelling

* chris' comment re overwriting

* friendlier api

* Runtime error if not recognised arg type fpor update()

* Runtime error if not recognised arg type fpor update()

* update provision to use new update() API

* update stack name

* tests

* tests

* tests

* quote marks

* type version, mypy

* various review comments

* test for update() function

* update tests to include quotes.

* fixing tests

* tidy tests, test error if resources provisioned

* clearer error handling

* remove unnecessary context

* update test to new error type

---------

Co-authored-by: Callum Wells <[email protected]>

* [RPD-286] Add documentation on "stacks" and the new LLM stack (#189)

* Initial version of docs

* Add motivation to stacks docs

* Various docs updates based on reviews

* [RPD-284] Update AzureRunner to select the Terraform files corresponding to the stack name in the config file (#187)

* [RPD-287] ZenML version inference for zenserver (#180)

* various changes

* change variable names to make things more clear

* shubham's comment

* test for latest

* update existing tests

* update existing tests

* shubham latest comment

* add gitflow notation branches to ci (#183)

* remove old, commented out code

* [RPD-260] Add an object to handle the `matcha.config.json` file. (#184)

* adds logic and tests for matcha config module

* updates docstrings

* adds tests and implements config object throughout matcha

* updated for pr comments

* updates docstring

* fixes ci

* updates for comments

* enum and metaclass

* enum and metaclass

* bug removal

* docstring stack_set

* docstring file exists

* fix current tests

* add a couple tests, move logic for updating to configservice

* lowercase-ify arg to enum

* american spelling

* american spelling

* chris' comment re overwriting

* friendlier api

* Runtime error if not recognised arg type fpor update()

* Runtime error if not recognised arg type fpor update()

* update provision to use new update() API

* update stack name

* tests

* tests

* tests

* Move files for selection

* Clean up file

* Remove a test that no longer describes the expected functionality

* Remove a test that no longer describes the expected functionality

* Fix tests

* Stop creation of config file in local directory when running tests

* Update check to use MatchaConfig object instead of to dict

* Add get current stack name function to MatchaConfigService

* Update get stack function

* Fix test

* Remove unnecessary lower casing

* Fix tests

* updates stack handling

* Revert "updates stack handling"

This reverts commit b42b31d.

* [RPD-289] Update the Chroma Terraform within the LLM stack to use Helm #194

* [RPD-292] [BUG] Update AzureTemplate to not create redundant folders during provisioning (#195)

* splits default stack

* updates llm stack

* [RPD-303] Update stack_set docstring to include example and raises (#196)

---------

Co-authored-by: Callum Wells <[email protected]>
Co-authored-by: KirsoppJ <[email protected]>
Co-authored-by: Jonathan Carlton <[email protected]>
4 people authored Aug 15, 2023
1 parent dc02ab3 commit 65a7795
Showing 168 changed files with 4,786 additions and 710 deletions.
52 changes: 52 additions & 0 deletions docs/resource-stacks.md
@@ -0,0 +1,52 @@
# Resource Stacks 📚

Machine Learning projects often vary in their size, from small-scale experimentation to large deployments, meaning that the infrastructure requirements also change and scale. For example, the infrastructure stack needed for deploying an LLM may require a GPU or vector database, which aren't usually needed in more general machine learning use-cases.

Matcha accommodates both ends of this range and currently offers two infrastructure stacks. This page describes each stack and shows how to get started with either.

> Note: A stack must be set before provisioning any resources and cannot be changed whilst a Matcha deployment exists.

## Available stacks

### DEFAULT

The `DEFAULT` stack is ideal for generic machine learning training and deployment, and is a good starting point. It includes:

* [Azure Kubernetes Service](https://azure.microsoft.com/en-gb/products/kubernetes-service)
* [ZenML](https://www.zenml.io/home)
* [Seldon Core](https://www.seldon.io/solutions/open-source-projects/core) (deployment)
* [MLflow](https://mlflow.org/) (experiment tracking)
* Data version control storage bucket

This is the stack used in the [getting started page](getting-started.md). Follow the link for more information.

### LLM

The `LLM` stack includes everything found in the `DEFAULT` stack, with the addition of a vector database, Chroma DB. It is tailored to the training and deployment of Large Language Models (LLMs). It includes:

* [Azure Kubernetes Service](https://azure.microsoft.com/en-gb/products/kubernetes-service)
* [ZenML](https://www.zenml.io/home)
* [Seldon Core](https://www.seldon.io/solutions/open-source-projects/core) (deployment)
* [MLflow](https://mlflow.org/) (experiment tracking)
* Data version control storage bucket
* [Chroma DB](https://www.trychroma.com/) (vector database for document retrieval)


We use this stack for [MindGPT](https://github.com/fuzzylabs/MindGPT), our large language model for mental health question answering.

## How to switch your stack

To switch to the `DEFAULT` stack, run the following command:

```bash
$ matcha stack set default
```

or, for the `LLM` stack:

```bash
$ matcha stack set llm
```

If no stack is set, Matcha will use the `DEFAULT` stack.

See the [API documentation](references.md) for more information.
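
To check from a script which stack is currently set, something like the following minimal sketch may help. It assumes that `matcha stack set` records the choice in `matcha.config.json` under a `stack` component with a `name` property, and that the name is stored in lower case. The helper `current_stack` is hypothetical and not part of Matcha's API.

```python
import json
import tempfile
from pathlib import Path


def current_stack(config_path: Path) -> str:
    """Return the stack name stored in a config file, falling back to 'default'."""
    if not config_path.exists():
        # No config file yet: the docs say the default stack is used.
        return "default"
    config = json.loads(config_path.read_text())
    # Assumed layout: {"stack": {"name": "<stack name>"}}
    return config.get("stack", {}).get("name", "default")


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "matcha.config.json"
    before = current_stack(path)  # no file has been written yet
    path.write_text(json.dumps({"stack": {"name": "llm"}}))
    after = current_stack(path)

print(before, after)
```

The fallback mirrors the behaviour described above: when nothing has been set, the default stack applies.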
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -14,6 +14,7 @@ nav:
- Azure Costs: 'costings.md'
- Inside Matcha:
- How does Matcha work: 'inside-matcha.md'
- Resource stacks: 'resource-stacks.md'
- Why we collect usage data: 'privacy.md'
- Tools:
- Data Version Control: 'data-version-control.md'
924 changes: 476 additions & 448 deletions poetry.lock

Large diffs are not rendered by default.

24 changes: 24 additions & 0 deletions src/matcha_ml/cli/cli.py
@@ -27,11 +27,17 @@

app = typer.Typer(no_args_is_help=True, pretty_exceptions_show_locals=False)
analytics_app = typer.Typer(no_args_is_help=True, pretty_exceptions_show_locals=False)
stack_app = typer.Typer(no_args_is_help=True, pretty_exceptions_show_locals=False)
app.add_typer(
    analytics_app,
    name="analytics",
    help="Enable or disable the collection of anonymous usage data (enabled by default).",
)
app.add_typer(
    stack_app,
    name="stack",
    help="Configure the stack for Matcha to provision.",
)


def fill_provision_variables(
@@ -242,5 +248,23 @@ def opt_in() -> None:
    core.analytics_opt_in()


@stack_app.command(help="Define the stack for Matcha to provision.")
def set(stack: str = typer.Argument("default")) -> None:
    """Define the stack for Matcha to provision.

    Args:
        stack (str): the name of the stack to provision.
    """
    try:
        core.stack_set(stack)
        print_status(build_status(f"Matcha '{stack}' stack has been set."))
    except MatchaInputError as e:
        print_error(str(e))
        raise typer.Exit()
    except MatchaError as e:
        print_error(str(e))
        raise typer.Exit()


if __name__ == "__main__":
    app()
16 changes: 16 additions & 0 deletions src/matcha_ml/config/__init__.py
@@ -0,0 +1,16 @@
"""Matcha config sub-module."""
from .matcha_config import (
    DEFAULT_CONFIG_NAME,
    MatchaConfig,
    MatchaConfigComponent,
    MatchaConfigComponentProperty,
    MatchaConfigService,
)

__all__ = [
    "MatchaConfigService",
    "MatchaConfig",
    "MatchaConfigComponentProperty",
    "MatchaConfigComponent",
    "DEFAULT_CONFIG_NAME",
]
226 changes: 226 additions & 0 deletions src/matcha_ml/config/matcha_config.py
@@ -0,0 +1,226 @@
"""The matcha.config.json file interface."""
import json
import os
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

from matcha_ml.errors import MatchaError

DEFAULT_CONFIG_NAME = "matcha.config.json"


@dataclass
class MatchaConfigComponentProperty:
    """A class to represent Matcha config properties."""

    name: str
    value: str

@dataclass
class MatchaConfigComponent:
"""A class to represent Matcha config components."""

name: str
properties: List[MatchaConfigComponentProperty]

def find_property(self, property_name: str) -> MatchaConfigComponentProperty:
"""Given a property name, find the property that matches it.
Note: this only works under the assumption of none-duplicated properties.
Args:
property_name (str): the name of the property.
Raises:
MatchaError: if the property could not be found.
Returns:
MatchaConfigComponentProperty: the property that matches the property_name parameter.
"""
property = next(
filter(lambda property: property.name == property_name, self.properties),
None,
)

if property is None:
raise MatchaError(
f"The property with the name '{property_name}' could not be found."
)

return property
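
The note about non-duplicated properties follows from the `next(filter(...), None)` idiom: it returns the first match and ignores any later ones. A minimal standalone sketch of that behaviour is below. `Prop` and `find_first` are hypothetical stand-ins for illustration, not names from this commit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prop:
    """Hypothetical stand-in for MatchaConfigComponentProperty."""

    name: str
    value: str


def find_first(properties: List[Prop], name: str) -> Optional[Prop]:
    # next() takes the first item the filter yields, or None if it yields nothing.
    return next(filter(lambda p: p.name == name, properties), None)


props = [Prop("name", "default"), Prop("name", "llm")]
first = find_first(props, "name")  # the later duplicate is never reached
missing = find_first(props, "region")  # no match at all
```

In the class above, the `None` case is turned into a `MatchaError`, so callers never receive `None` directly.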


@dataclass
class MatchaConfig:
    """A class to represent the Matcha config file."""

    components: List[MatchaConfigComponent]

    def find_component(self, component_name: str) -> MatchaConfigComponent:
        """Given a component name, find the component that matches it.

        Note: this only works under the assumption of non-duplicated components.

        Args:
            component_name (str): the name of the component.

        Raises:
            MatchaError: if the component could not be found.

        Returns:
            MatchaConfigComponent: the component that matches the component_name parameter.
        """
        component = next(
            filter(lambda component: component.name == component_name, self.components),
            None,
        )

        if component is None:
            raise MatchaError(
                f"The component with the name '{component_name}' could not be found."
            )

        return component

    def to_dict(self) -> Dict[str, Dict[str, str]]:
        """A function to convert the MatchaConfig class into a dictionary.

        Returns:
            Dict[str, Dict[str, str]]: the MatchaConfig as a dictionary.
        """
        state_dictionary = {}
        for config_component in self.components:
            state_dictionary[config_component.name] = {
                property.name: property.value
                for property in config_component.properties
            }

        return state_dictionary

    @staticmethod
    def from_dict(state_dict: Dict[str, Dict[str, str]]) -> "MatchaConfig":
        """A function to convert a dictionary representation of the Matcha config file into a MatchaConfig instance.

        Args:
            state_dict (Dict[str, Dict[str, str]]): the dictionary representation of the Matcha config file.

        Returns:
            MatchaConfig: the MatchaConfig instance built from the dictionary.
        """
        components: List[MatchaConfigComponent] = []
        for resource, properties in state_dict.items():
            components.append(
                MatchaConfigComponent(
                    name=resource,
                    properties=[
                        MatchaConfigComponentProperty(name=key, value=value)
                        for key, value in properties.items()
                    ],
                )
            )

        return MatchaConfig(components=components)
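
The two conversions should round-trip a config dictionary unchanged as long as component names are unique. The sketch below illustrates this with trimmed-down stand-in classes (`Property` and `Component`, plus module-level `from_dict` and `to_dict`). These mirror the logic above but are assumptions written for illustration, and the `example` component is made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Property:
    name: str
    value: str


@dataclass
class Component:
    name: str
    properties: List[Property]


def from_dict(data: Dict[str, Dict[str, str]]) -> List[Component]:
    # One component per top-level key, one property per nested key.
    return [
        Component(name, [Property(key, value) for key, value in props.items()])
        for name, props in data.items()
    ]


def to_dict(components: List[Component]) -> Dict[str, Dict[str, str]]:
    # A later component with the same name overwrites the earlier entry.
    return {c.name: {p.name: p.value for p in c.properties} for c in components}


config = {"stack": {"name": "llm"}, "example": {"key": "value"}}
round_tripped = to_dict(from_dict(config))

# Appending a second "stack" component replaces the first when converted back.
duplicated = from_dict(config) + [Component("stack", [Property("name", "default")])]
collapsed = to_dict(duplicated)
```

The second half shows why duplicate component names matter: the in-memory list keeps both entries, but only the last one survives serialisation.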


class MatchaConfigService:
    """A service for handling the Matcha config file."""

    @staticmethod
    def get_stack() -> Optional[MatchaConfigComponentProperty]:
        """Gets the current stack name from the Matcha Config if it exists.

        Returns:
            Optional[MatchaConfigComponentProperty]: The name of the current stack being used as a config component object.
        """
        try:
            stack = (
                MatchaConfigService.read_matcha_config()
                .find_component("stack")
                .find_property("name")
            )
        except MatchaError:
            stack = None

        return stack

    @staticmethod
    def write_matcha_config(matcha_config: MatchaConfig) -> None:
        """A function for writing the local Matcha config file.

        Args:
            matcha_config (MatchaConfig): the MatchaConfig object to write to the config file.
        """
        local_config_file = os.path.join(os.getcwd(), DEFAULT_CONFIG_NAME)

        with open(local_config_file, "w") as file:
            json.dump(matcha_config.to_dict(), file)

    @staticmethod
    def read_matcha_config() -> MatchaConfig:
        """A function for reading the Matcha config file into a MatchaConfig object.

        Returns:
            MatchaConfig: the MatchaConfig representation of the local config file.

        Raises:
            MatchaError: raises a MatchaError if the local config file could not be read.
        """
        local_config_file = os.path.join(os.getcwd(), DEFAULT_CONFIG_NAME)

        if os.path.exists(local_config_file):
            with open(local_config_file) as config:
                local_config = json.load(config)

            return MatchaConfig.from_dict(local_config)
        else:
            raise MatchaError(
                f"No '{DEFAULT_CONFIG_NAME}' file found, please generate one by running 'matcha provision', "
                f"or add an existing '{DEFAULT_CONFIG_NAME}' file to the root project directory."
            )

    @staticmethod
    def config_file_exists() -> bool:
        """A convenience function which checks for the existence of the matcha.config.json file.

        Returns:
            bool: True if the matcha.config.json file exists, False otherwise.
        """
        return os.path.exists(os.path.join(os.getcwd(), DEFAULT_CONFIG_NAME))

    @staticmethod
    def update(
        components: Union[MatchaConfigComponent, List[MatchaConfigComponent]]
    ) -> None:
        """A function which updates the matcha config file.

        If no config file exists, this function will create one.

        Args:
            components (Union[MatchaConfigComponent, List[MatchaConfigComponent]]): a single MatchaConfigComponent or a list of them.
        """
        if isinstance(components, MatchaConfigComponent):
            components = [components]

        if MatchaConfigService.config_file_exists():
            config = MatchaConfigService.read_matcha_config()
            config.components += components
        else:
            config = MatchaConfig(components)

        MatchaConfigService.write_matcha_config(config)

    @staticmethod
    def delete_matcha_config() -> None:
        """A function for deleting the local Matcha config file.

        Raises:
            MatchaError: raises a MatchaError if the local config file could not be removed.
        """
        local_config_file = os.path.join(os.getcwd(), DEFAULT_CONFIG_NAME)

        try:
            os.remove(local_config_file)
        except Exception as e:
            raise MatchaError(
                f"Local config file at path:{local_config_file} could not be removed."
            ) from e
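
A point worth spelling out about `update`: as written, new components are appended to the list read from disk, and `to_dict` then keeps the last entry for any repeated name. So, on my reading of the diff, updating an existing component replaces it wholesale rather than merging properties. The sketch below reproduces that effect with plain dictionaries. `update_config` is a hypothetical stand-in, not the service itself.

```python
import json
import os
import tempfile
from typing import Dict

Config = Dict[str, Dict[str, str]]


def update_config(path: str, new_components: Config) -> Config:
    """Add components to a JSON config file, creating the file if it is missing."""
    config: Config = {}
    if os.path.exists(path):
        with open(path) as file:
            config = json.load(file)
    # A component whose name already exists replaces the earlier entry entirely.
    config.update(new_components)
    with open(path, "w") as file:
        json.dump(config, file)
    return config


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "matcha.config.json")
    first = update_config(path, {"stack": {"name": "default"}})
    second = update_config(path, {"stack": {"name": "llm"}})
```

Under this reading, setting the stack twice leaves a single `stack` entry holding the most recent name.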
2 changes: 2 additions & 0 deletions src/matcha_ml/core/__init__.py
@@ -6,6 +6,7 @@
    get,
    provision,
    remove_state_lock,
    stack_set,
)

__all__ = [
@@ -15,4 +16,5 @@
    "remove_state_lock",
    "destroy",
    "provision",
    "stack_set",
]