MLflow + Ray Serve: Update header levels #265

Merged 2 commits on Jan 10, 2025
14 changes: 7 additions & 7 deletions notebooks/en/mlflow_ray_serve.ipynb
@@ -27,11 +27,11 @@
"id": "IuS0daXP1lIa"
},
"source": [
"# Introduction\n",
"## Introduction\n",
"\n",
"This notebook explores solutions for streamlining the deployment of models from a model registry. For teams that want to productionize many models over time, investments at this \"transition point\" in the AI/ML project lifecycle can meaningfully drive down time-to-production. This can be important for a younger, smaller team that may not have the benefit of existing infrastructure to form a \"golden path\" for serving online models in production.\n",
"\n",
"# Motivation\n",
"## Motivation\n",
"\n",
"Optimizing this stage of the model lifecycle is particularly important due to the production-facing aspect of the end result. At this stage, your model becomes, in effect, a microservice. This means that you now need to contend with all elements of service ownership, which can include:\n",
"\n",
@@ -52,7 +52,7 @@
"id": "fXlB7AJr2foY"
},
"source": [
"# Components\n",
"## Components\n",
"\n",
"For our exploration here, we'll use the following minimal stack:\n",
"\n",
@@ -87,7 +87,7 @@
"id": "C0UziXBN4Szc"
},
"source": [
"# Register the Model\n",
"## Register the Model\n",
"\n",
"First, let's define the model that we'll use for our exploration today. For simplicity's sake, we'll use a simple text translation model, where the source and destination languages are configurable at registration time. In effect, this means that different \"versions\" of the model can be registered to translate different languages, but the underlying model architecture and weights can stay the same."
]
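Since the diff only shows this cell's heading, here is a minimal sketch of what such a registration step could look like, assuming a Hugging Face `transformers` translation pipeline wrapped in `mlflow.pyfunc`. The `TranslationModel` class and the Helsinki-NLP model naming are illustrative; only the `translation_model` registry name comes from the notebook itself.

```python
import mlflow
from transformers import pipeline


class TranslationModel(mlflow.pyfunc.PythonModel):
    """Wraps a translation pipeline for a language pair fixed at registration."""

    def __init__(self, src: str, dst: str):
        self.src, self.dst = src, dst
        self.pipe = None  # loaded lazily at serving time

    def load_context(self, context):
        # Runs once per serving process, not at registration time.
        self.pipe = pipeline(
            "translation",
            model=f"Helsinki-NLP/opus-mt-{self.src}-{self.dst}",
        )

    def predict(self, context, model_input, params=None):
        return [out["translation_text"] for out in self.pipe(list(model_input))]


with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=TranslationModel("en", "fr"),
        registered_model_name="translation_model",  # each call adds a new version
    )
```

Registering again with, say, `TranslationModel("en", "de")` would produce version `2` under the same registered name, matching the "same architecture, different language pair" setup described above.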
@@ -232,7 +232,7 @@
"id": "iwa3o-0B9FPO"
},
"source": [
"# Serve the Model\n",
"## Serve the Model\n",
"\n",
"Now that our model is registered in MLflow, let's set up our serving scaffolding using [Ray Serve](https://docs.ray.io/en/latest/serve/index.html). For now, we'll limit our \"deployment\" to the following behavior:\n",
"\n",
@@ -349,7 +349,7 @@
"id": "hsJ65rNNDMVj"
},
"source": [
"# Multiple Versions, One Endpoint\n",
"## Multiple Versions, One Endpoint\n",
"\n",
"Now we've got a basic endpoint set up for our model. Great! However, notice that this deployment is strictly tethered to a single version of this model -- specifically, version `1` of the registered `translation_model`.\n",
"\n",
@@ -980,7 +980,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conclusion\n",
"## Conclusion\n",
"\n",
"In this notebook, we've leveraged MLflow's built-in support for tracking model signatures to heavily streamline the process of deploying an HTTP server to serve that model in online fashion. We've taken Ray Serve's powerful-but-fiddly primitives to empower ourselves to, in one line, deploy a model server with:\n",
"\n",
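To close the loop, here is a hypothetical client call against the multiplexed endpoint sketched earlier, assuming Ray Serve's default local address and port:

```python
import requests

resp = requests.post(
    "http://127.0.0.1:8000/translate",
    json={"text": ["Hello, world!"]},
    # Select version 2 of translation_model for this request.
    headers={"serve_multiplexed_model_id": "2"},
)
print(resp.json())
```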