[Feature] LLM Stack (#191)

* [RPD-282] adds stack cli command (#179)
* adds tests and logic for stack and stack set cli commands
* fixes typos
* updated for comments
* addresses comments
* [RPD-285] Create LLM stack Terraform files (#182)
* Add initial LLM Terraform files
* Update Terraform Kubernetes Chroma deployment
* Update Terraform docs
* Revert default variables
* Add missing aks module
* [RPD-283] add core function for updating stack type (#185)
* [RPD-287] ZenML version inference for zenserver (#180)
* various changes
* change variable names to make things more clear
* shubham's comment
* test for latest
* update existing tests
* update existing tests
* shubham latest comment
* add gitflow notation branches to ci (#183)
* remove old, commented out code
* [RPD-260] Add an object to handle the `matcha.config.json` file. (#184)
* adds logic and tests for matcha config module
* updates docstrings
* adds tests and implements config object throughout matcha
* updated for pr comments
* updates docstring
* fixes ci
* updates for comments
* enum and metaclass
* enum and metaclass
* bug removal
* docstring stack_set
* docstring file exists
* fix current tests
* add a couple tests, move logic for updating to configservice
* lowercase-ify arg to enum
* american spelling
* american spelling
* chris' comment re overwriting
* friendlier api
* Runtime error if not recognised arg type fpor update()
* Runtime error if not recognised arg type fpor update()
* update provision to use new update() API
* update stack name
* tests
* tests
* tests
* quote marks
* type version, mypy
* various review comments
* test for update() function
* update tests to include quotes.
* fixing tests
* tidy tests, test error if resources provisioned
* clearer error handling
* remove unnecessary context
* update test to new error type

---------

Co-authored-by: Callum Wells <[email protected]>

* [RPD-286] Add documentation on "stacks" and the new LLM stack (#189)
* Initial version of docs
* Add motivation to stacks docs
* Various docs updates based on reviews
* [RPD-284] Update AzureRunner to select the Terraform files corresponding to the stack name in the config file (#187)
* [RPD-287] ZenML version inference for zenserver (#180)
* various changes
* change variable names to make things more clear
* shubham's comment
* test for latest
* update existing tests
* update existing tests
* shubham latest comment
* add gitflow notation branches to ci (#183)
* remove old, commented out code
* [RPD-260] Add an object to handle the `matcha.config.json` file. (#184)
* adds logic and tests for matcha config module
* updates docstrings
* adds tests and implements config object throughout matcha
* updated for pr comments
* updates docstring
* fixes ci
* updates for comments
* enum and metaclass
* enum and metaclass
* bug removal
* docstring stack_set
* docstring file exists
* fix current tests
* add a couple tests, move logic for updating to configservice
* lowercase-ify arg to enum
* american spelling
* american spelling
* chris' comment re overwriting
* friendlier api
* Runtime error if not recognised arg type fpor update()
* Runtime error if not recognised arg type fpor update()
* update provision to use new update() API
* update stack name
* tests
* tests
* tests
* Move files for selection
* Clean up file
* Remove a test that no longer describes the expected functionality
* Remove a test that no longer describes the expected functionality
* Fix tests
* Stop creation of config file in local directory when running tests
* Update check to use MatchaConfig object instead of to dict
* Add get current stack name function to MatchaConfigService
* Update get stack function
* Fix test
* Remove unnecessary lower casing
* Fix tests
* updates stack handling
* Revert "updates stack handling" This reverts commit b42b31d.
* [RPD-289] Update the Chroma Terraform within the LLM stack to use Helm #194
* [RPD-292] [BUG] Update AzureTemplate to not create redundant folders during provisioning (#195)
* splits default stack
* updates llm stack
* [RPD-303] Update stack_set docstring to include example and raises (#196)

---------

Co-authored-by: Callum Wells <[email protected]>
Co-authored-by: KirsoppJ <[email protected]>
Co-authored-by: Jonathan Carlton <[email protected]>
fuzzylabs · Aug 15, 2023 · 65a7795
1 parent dc02ab3
commit 65a7795
Showing 168 changed files with 4,786 additions and 710 deletions.
diff --git a/docs/resource-stacks.md b/docs/resource-stacks.md
@@ -0,0 +1,52 @@
+# Resource Stacks 📚
+
+Machine Learning projects often vary in their size, from small-scale experimentation to large deployments, meaning that the infrastructure requirements also change and scale. For example, the infrastructure stack needed for deploying an LLM may require a GPU or vector database, which aren't usually needed in more general machine learning use-cases.
+
+Matcha accommodates both kinds of requirement and currently offers two infrastructure stacks. This page describes each stack and shows how to get started with either.
+
+> Note: A stack must be set before provisioning any resources and cannot be changed while a Matcha deployment exists.
+
+## Available stacks
+
+### DEFAULT
+
+The `DEFAULT` stack is ideal for generic machine learning training and deployment, and is a good starting point. It includes:
+   * [Azure Kubernetes Service](https://azure.microsoft.com/en-gb/products/kubernetes-service)
+   * [ZenML](https://www.zenml.io/home)
+   * [Seldon Core](https://www.seldon.io/solutions/open-source-projects/core) (deployment)
+   * [MLflow](https://mlflow.org/) (experiment tracking)
+   * Data version control storage bucket
+
+This is the stack used in the [getting started page](getting-started.md). Follow the link for more information.
+
+### LLM
+
+The `LLM` stack includes everything in the `DEFAULT` stack plus a vector database, Chroma DB, and is tailored to training and deploying Large Language Models (LLMs). It includes:
+
+   * [Azure Kubernetes Service](https://azure.microsoft.com/en-gb/products/kubernetes-service)
+   * [ZenML](https://www.zenml.io/home)
+   * [Seldon Core](https://www.seldon.io/solutions/open-source-projects/core) (deployment)
+   * [MLflow](https://mlflow.org/) (experiment tracking)
+   * Data version control storage bucket
+   * [Chroma DB](https://www.trychroma.com/) (vector database for document retrieval)
+
+
+We use this stack for [MindGPT](https://github.com/fuzzylabs/MindGPT), our large language model for mental health question answering.
+
+## How to switch your stack
+
+To switch to the `DEFAULT` stack, run the following command:
+
+```bash
+$ matcha stack set default
+```
+
+or, for the `LLM` stack:
+
+```bash
+$ matcha stack set llm
+```
+
+If no stack is set, Matcha uses the `DEFAULT` stack.
+
+See the [API documentation](references.md) for more information.
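
The fallback described in the docs above ("if no stack is set, Matcha uses the default stack") can be illustrated with a small standalone sketch. It assumes the config file has the nested shape `{"stack": {"name": ...}}`, which is what `MatchaConfig.to_dict` would produce for a `stack` component with a `name` property. The helper `current_stack_name`, the constant `DEFAULT_STACK`, and the stored value `"llm"` are hypothetical and are not part of Matcha's API.

```python
import json
import os
import tempfile

DEFAULT_STACK = "default"  # assumption: the name used when no stack has been set


def current_stack_name(config_path: str) -> str:
    """Return the stack name stored in a config file, or the default if absent."""
    if not os.path.exists(config_path):
        return DEFAULT_STACK
    with open(config_path) as config_file:
        config = json.load(config_file)
    # Fall back to the default if the "stack" component or its "name" is missing.
    return config.get("stack", {}).get("name", DEFAULT_STACK)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "matcha.config.json")
    missing = current_stack_name(path)  # no config file exists yet
    with open(path, "w") as config_file:
        json.dump({"stack": {"name": "llm"}}, config_file)
    stored = current_stack_name(path)  # config file now names a stack

print(missing, stored)
```

The real lookup goes through `MatchaConfigService.get_stack()`, which returns `None` rather than a default name when the file or property is missing, so the fallback itself presumably lives elsewhere in the code base.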
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -14,6 +14,7 @@ nav:
     - Azure Costs: 'costings.md'
   - Inside Matcha:
     - How does Matcha work: 'inside-matcha.md'
+    - Resource stacks: 'resource-stacks.md'
     - Why we collect usage data: 'privacy.md'
   - Tools:
     - Data Version Control: 'data-version-control.md'

diff --git a/poetry.lock b/poetry.lock
diff --git a/src/matcha_ml/cli/cli.py b/src/matcha_ml/cli/cli.py
@@ -27,11 +27,17 @@
 
 app = typer.Typer(no_args_is_help=True, pretty_exceptions_show_locals=False)
 analytics_app = typer.Typer(no_args_is_help=True, pretty_exceptions_show_locals=False)
+stack_app = typer.Typer(no_args_is_help=True, pretty_exceptions_show_locals=False)
 app.add_typer(
     analytics_app,
     name="analytics",
     help="Enable or disable the collection of anonymous usage data (enabled by default).",
 )
+app.add_typer(
+    stack_app,
+    name="stack",
+    help="Configure the stack for Matcha to provision.",
+)
 
 
 def fill_provision_variables(
@@ -242,5 +248,23 @@ def opt_in() -> None:
     core.analytics_opt_in()
 
 
+@stack_app.command(help="Define the stack for Matcha to provision.")
+def set(stack: str = typer.Argument("default")) -> None:
+    """Define the stack for Matcha to provision.
+
+    Args:
+        stack (str): the name of the stack to provision.
+    """
+    try:
+        core.stack_set(stack)
+        print_status(build_status(f"Matcha '{stack}' stack has been set."))
+    except MatchaInputError as e:
+        print_error(str(e))
+        raise typer.Exit()
+    except MatchaError as e:
+        print_error(str(e))
+        raise typer.Exit()
+
+
 if __name__ == "__main__":
     app()
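
The `set` command above delegates validation to `core.stack_set`, which is not shown in this part of the diff. The commit message mentions lower-casing the argument and mapping it to an enum, so the following is a rough sketch of how such validation might look, not the actual implementation. `StackType`, `StackInputError` (a stand-in for `MatchaInputError`), and `parse_stack_name` are hypothetical names.

```python
from enum import Enum


class StackInputError(ValueError):
    """Hypothetical stand-in for MatchaInputError."""


class StackType(Enum):
    """Hypothetical enum of the stacks named in the docs."""

    DEFAULT = "default"
    LLM = "llm"


def parse_stack_name(name: str) -> StackType:
    """Map a user-supplied stack name to a StackType, ignoring case."""
    try:
        return StackType(name.lower())
    except ValueError:
        valid = ", ".join(stack.value for stack in StackType)
        raise StackInputError(
            f"'{name}' is not a valid stack; choose one of: {valid}."
        ) from None


print(parse_stack_name("LLM"))
```

Raising a dedicated input error lets the CLI layer catch it and print a friendly message before exiting, which is the pattern the `except MatchaInputError` branch follows.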
diff --git a/src/matcha_ml/config/__init__.py b/src/matcha_ml/config/__init__.py
@@ -0,0 +1,16 @@
+"""Matcha config sub-module."""
+from .matcha_config import (
+    DEFAULT_CONFIG_NAME,
+    MatchaConfig,
+    MatchaConfigComponent,
+    MatchaConfigComponentProperty,
+    MatchaConfigService,
+)
+
+__all__ = [
+    "MatchaConfigService",
+    "MatchaConfig",
+    "MatchaConfigComponentProperty",
+    "MatchaConfigComponent",
+    "DEFAULT_CONFIG_NAME",
+]
diff --git a/src/matcha_ml/config/matcha_config.py b/src/matcha_ml/config/matcha_config.py
@@ -0,0 +1,226 @@
+"""The matcha.config.json file interface."""
+import json
+import os
+from dataclasses import dataclass
+from typing import Dict, List, Optional, Union
+
+from matcha_ml.errors import MatchaError
+
+DEFAULT_CONFIG_NAME = "matcha.config.json"
+
+
+@dataclass
+class MatchaConfigComponentProperty:
+    """A class to represent Matcha config properties."""
+
+    name: str
+    value: str
+
+
+@dataclass
+class MatchaConfigComponent:
+    """A class to represent Matcha config components."""
+
+    name: str
+    properties: List[MatchaConfigComponentProperty]
+
+    def find_property(self, property_name: str) -> MatchaConfigComponentProperty:
+        """Given a property name, find the property that matches it.
+
+        Note: this only works under the assumption of non-duplicated properties.
+
+        Args:
+            property_name (str): the name of the property.
+
+        Raises:
+            MatchaError: if the property could not be found.
+
+        Returns:
+            MatchaConfigComponentProperty: the property that matches the property_name parameter.
+        """
+        property = next(
+            filter(lambda property: property.name == property_name, self.properties),
+            None,
+        )
+
+        if property is None:
+            raise MatchaError(
+                f"The property with the name '{property_name}' could not be found."
+            )
+
+        return property
+
+
+@dataclass
+class MatchaConfig:
+    """A class to represent the Matcha config file."""
+
+    components: List[MatchaConfigComponent]
+
+    def find_component(self, component_name: str) -> MatchaConfigComponent:
+        """Given a component name, find the component that matches it.
+
+        Note: this only works under the assumption of non-duplicated components.
+
+        Args:
+            component_name (str): the name of the component.
+
+        Raises:
+            MatchaError: if the component could not be found.
+
+        Returns:
+            MatchaConfigComponent: the component that matches the component_name parameter.
+        """
+        component = next(
+            filter(lambda component: component.name == component_name, self.components),
+            None,
+        )
+
+        if component is None:
+            raise MatchaError(
+                f"The component with the name '{component_name}' could not be found."
+            )
+
+        return component
+
+    def to_dict(self) -> Dict[str, Dict[str, str]]:
+        """A function to convert the MatchaConfig class into a dictionary.
+
+        Returns:
+            Dict[str, Dict[str, str]]: the MatchaConfig as a dictionary.
+        """
+        state_dictionary = {}
+        for config_component in self.components:
+            state_dictionary[config_component.name] = {
+                property.name: property.value
+                for property in config_component.properties
+            }
+
+        return state_dictionary
+
+    @staticmethod
+    def from_dict(state_dict: Dict[str, Dict[str, str]]) -> "MatchaConfig":
+        """A function to convert a dictionary representation of the Matcha config file into a MatchaConfig instance.
+
+        Args:
+            state_dict (Dict[str, Dict[str, str]]): the dictionary representation of the Matcha config file.
+
+        Returns:
+            MatchaConfig: the MatchaConfig instance built from the dictionary.
+        """
+        components: List[MatchaConfigComponent] = []
+        for resource, properties in state_dict.items():
+            components.append(
+                MatchaConfigComponent(
+                    name=resource,
+                    properties=[
+                        MatchaConfigComponentProperty(name=key, value=value)
+                        for key, value in properties.items()
+                    ],
+                )
+            )
+
+        return MatchaConfig(components=components)
+
+
+class MatchaConfigService:
+    """A service for handling the Matcha config file."""
+
+    @staticmethod
+    def get_stack() -> Optional[MatchaConfigComponentProperty]:
+        """Gets the current stack name from the Matcha Config if it exists.
+
+        Returns:
+            Optional[MatchaConfigComponentProperty]: The name of the current stack being used as a config component object.
+        """
+        try:
+            stack = (
+                MatchaConfigService.read_matcha_config()
+                .find_component("stack")
+                .find_property("name")
+            )
+        except MatchaError:
+            stack = None
+
+        return stack
+
+    @staticmethod
+    def write_matcha_config(matcha_config: MatchaConfig) -> None:
+        """A function for writing the local Matcha config file.
+
+        Args:
+            matcha_config (MatchaConfig): the MatchaConfig instance to write to the local config file.
+        """
+        local_config_file = os.path.join(os.getcwd(), DEFAULT_CONFIG_NAME)
+
+        with open(local_config_file, "w") as file:
+            json.dump(matcha_config.to_dict(), file)
+
+    @staticmethod
+    def read_matcha_config() -> MatchaConfig:
+        """A function for reading the Matcha config file into a MatchaConfig object.
+
+        Returns:
+            MatchaConfig: the MatchaConfig representation of the local config file.
+
+        Raises:
+            MatchaError: raises a MatchaError if the local config file could not be read.
+        """
+        local_config_file = os.path.join(os.getcwd(), DEFAULT_CONFIG_NAME)
+
+        if os.path.exists(local_config_file):
+            with open(local_config_file) as config:
+                local_config = json.load(config)
+
+            return MatchaConfig.from_dict(local_config)
+        else:
+            raise MatchaError(
+                f"No '{DEFAULT_CONFIG_NAME}' file found, please generate one by running 'matcha provision', or add an existing '{DEFAULT_CONFIG_NAME}' file to the root project directory."
+            )
+
+    @staticmethod
+    def config_file_exists() -> bool:
+        """A convenience function which checks for the existence of the matcha.config.json file.
+
+        Returns:
+            True if the matcha.config.json file exists, False otherwise.
+        """
+        return os.path.exists(os.path.join(os.getcwd(), DEFAULT_CONFIG_NAME))
+
+    @staticmethod
+    def update(
+        components: Union[MatchaConfigComponent, List[MatchaConfigComponent]]
+    ) -> None:
+        """A function which updates the matcha config file.
+
+        If no config file exists, this function will create one.
+
+        Args:
+            components (Union[MatchaConfigComponent, List[MatchaConfigComponent]]): a single MatchaConfigComponent object or a list of them.
+        """
+        if isinstance(components, MatchaConfigComponent):
+            components = [components]
+
+        if MatchaConfigService.config_file_exists():
+            config = MatchaConfigService.read_matcha_config()
+            config.components += components
+        else:
+            config = MatchaConfig(components)
+
+        MatchaConfigService.write_matcha_config(config)
+
+    @staticmethod
+    def delete_matcha_config() -> None:
+        """A function for deleting the local Matcha config file.
+
+        Raises:
+            MatchaError: raises a MatchaError if the local config file could not be removed.
+        """
+        local_config_file = os.path.join(os.getcwd(), DEFAULT_CONFIG_NAME)
+
+        try:
+            os.remove(local_config_file)
+        except Exception:
+            raise MatchaError(
+                f"Local config file at path:{local_config_file} could not be removed."
+            )
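
To illustrate how the `to_dict`/`from_dict` pair and the append in `update()` interact, here is a standalone sketch using simplified stand-ins (`Property`, `Component`, and module-level `to_dict`/`from_dict`) rather than the real Matcha classes. It assumes the same nested-dict layout as the code above. It suggests that appending a component whose name already exists leaves two entries in memory, while the serialised dictionary keeps only the later one.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Property:
    """Simplified stand-in for MatchaConfigComponentProperty."""

    name: str
    value: str


@dataclass
class Component:
    """Simplified stand-in for MatchaConfigComponent."""

    name: str
    properties: List[Property]


def to_dict(components: List[Component]) -> Dict[str, Dict[str, str]]:
    """Flatten components into the nested-dict shape written to JSON."""
    return {c.name: {p.name: p.value for p in c.properties} for c in components}


def from_dict(data: Dict[str, Dict[str, str]]) -> List[Component]:
    """Rebuild components from the nested-dict shape."""
    return [
        Component(name, [Property(key, value) for key, value in props.items()])
        for name, props in data.items()
    ]


# Round trip: with unique component names, from_dict undoes to_dict.
original = [Component("stack", [Property("name", "default")])]
round_tripped = from_dict(to_dict(original))

# Append a second "stack" component, as `config.components += components` would.
merged = original + [Component("stack", [Property("name", "llm")])]

# A first-match lookup (like find_component) still sees the older entry...
first_match = next(c for c in merged if c.name == "stack")
# ...whereas serialising keeps only the later one, because dict keys are unique.
serialized = to_dict(merged)

print(first_match.properties[0].value)
print(serialized)
```

Under these assumptions the file on disk ends up with the newer value, so a fresh read returns the updated stack. Only an in-memory lookup between the append and the write would see the stale entry.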
diff --git a/src/matcha_ml/core/__init__.py b/src/matcha_ml/core/__init__.py
@@ -6,6 +6,7 @@
     get,
     provision,
     remove_state_lock,
+    stack_set,
 )
 
 __all__ = [
@@ -15,4 +16,5 @@
     "remove_state_lock",
     "destroy",
     "provision",
+    "stack_set",
 ]