This repository contains the submission by The Dobblegängers (a.k.a. Fuzzy Labs) to the ZenML Month of MLOps competition.
- The Dobblegängers
- What have we done?
- Code & Repository Structure
- Project Overview
- Setup
- Running the Pipelines
- Blog Posts & Demo
At Fuzzy Labs we're trying to become Dobble world champions. So, we came up with a plan: we've trained an ML model to recognise the common symbol between two cards, and what better way to build it than with a ZenML pipeline?
If you're reading this and wondering what on earth Dobble is, let us explain. It's a game of speed and observation where the aim is to be the quickest to identify the common symbol between two cards. If you're the first to find it and name it, then you win the card. Simple, right? In essence, it's a more sophisticated version of snap.
Now that you're all caught up, let's go into a little more detail about what we've done. As we want to win the world championships, we obviously need a concealable device. So, to give ourselves an extra challenge as well, we decided to deploy our model to an NVIDIA Jetson Nano.
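The rule of the game boils down to a set intersection: any two Dobble cards share exactly one symbol. Here is a minimal sketch in plain Python, assuming each card is represented as a set of symbol names. The symbol names and the `common_symbol` helper are made up for illustration and are not taken from this repository.

```python
def common_symbol(card_a: set, card_b: set) -> str:
    """Return the single symbol shared by two cards.

    Raises ValueError if the cards do not share exactly one symbol,
    which would mean they are not a valid Dobble pair.
    """
    shared = card_a & card_b
    if len(shared) != 1:
        raise ValueError(f"expected exactly one shared symbol, got {sorted(shared)}")
    return next(iter(shared))


# Hypothetical cards with made-up symbol names.
card_a = {"anchor", "cactus", "ghost", "key"}
card_b = {"dolphin", "ghost", "igloo", "moon"}
print(common_symbol(card_a, card_b))  # ghost
```

The hard part, and the reason for training a model, is that a player is never handed these sets and has to spot the symbols first.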
This repository contains all the code and resources to set up and run a data pipeline, training pipeline, and inference on the Jetson. It's structured as follows:
.
├── LICENSE
├── pyproject.toml
├── README.md
├── requirements.txt                          # dependencies required for the project
├── docs                                      # detailed documentation for the project
├── pipelines                                 # all pipelines inside this folder
│   └── training_pipeline
│       ├── training_pipeline.py
│       └── config_training_pipeline.yaml     # each pipeline has one config file containing step settings and other configuration
├── run.py                                    # main file where all pipelines can be run
├── steps                                     # all steps inside this folder
│   ├── data_preprocess                       # each step is in its own folder (as per ZenML best practices)
│   │   └── data_preprocess_step.py
│   └── src                                   # extra utilities that are required by steps are added in this folder
└── zenml_stack_recipes                       # contains the modified aws-minimal stack recipe
As we've also used some cloud resources to store data and host experiment tracking, we used one of the ZenML stack recipes. There's more information on this here.
To give an overview of our solution (see here for an in-depth description), we've broken this challenge down into three stages, with two pipelines:
- Data pipeline: downloads the labelled data, processes it into the correct format for training, and uploads it to an S3 bucket.
- Training pipeline: downloads the data, validates it, trains and evaluates a model, and exports the model to the correct format for deployment.
- Inference on the Jetson: the trained model is loaded onto the device and inference is performed in real time.
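To make the ordering of the stages concrete, here is a rough sketch in plain Python of how the two pipelines chain their steps. It deliberately does not use the ZenML API. The step names, `make_step` and `run_pipeline` are hypothetical stand-ins, and each placeholder step only records that it ran.

```python
from typing import Callable, List, Optional

Step = Callable[[Optional[List[str]]], List[str]]


def make_step(name: str) -> Step:
    """Build a placeholder step that appends its own name to a history list."""
    def step(history: Optional[List[str]]) -> List[str]:
        return (history or []) + [name]
    step.__name__ = name
    return step


def run_pipeline(name: str, steps: List[Step]) -> List[str]:
    """Run the steps in order, feeding each step's output into the next."""
    value: Optional[List[str]] = None
    for step in steps:
        value = step(value)
        print(f"[{name}] finished step: {step.__name__}")
    return value or []


# Hypothetical step names that mirror the stage descriptions above.
DATA_PIPELINE = [make_step(n) for n in (
    "download_labelled_data", "process_for_training", "upload_to_s3")]
TRAINING_PIPELINE = [make_step(n) for n in (
    "download_data", "validate_data", "train_model",
    "evaluate_model", "export_model")]

data_result = run_pipeline("data", DATA_PIPELINE)
training_result = run_pipeline("training", TRAINING_PIPELINE)
```

In the real project each step would be a ZenML step with its own inputs and outputs; the sketch only shows the order in which the work happens.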
The first step is to create a virtual environment and install the project requirements. We've used conda,
but feel free to use whatever you prefer (as long as you can install a set of requirements):
conda create -n dobble_venv python=3.8 -y
conda activate dobble_venv
pip install -r requirements.txt
The next step is to set up ZenML, starting with installing the required integrations:
zenml integrations install -y pytorch mlflow
Initialise the ZenML repository:
zenml init
Start the ZenServer:
zenml up
Note: The ZenML dashboard is available at 'http://127.0.0.1:8237'. You can connect to it using the 'default' username and an empty password. If there's a TCP error about the port not being available, run
fuser -k port_no/tcp
to free the port (replace port_no with the port number), then run the zenml up command again. On macOS, run
kill $(lsof -t -i:8237)
instead.
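If you'd rather check whether the port is busy before killing anything, a small Python check along these lines may help. It is a sketch using only the standard library; `port_in_use` is an illustrative helper and not part of ZenML or this repository.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8237 is the dashboard port mentioned in the note above.
    if port_in_use(8237):
        print("Port 8237 is busy; free it before running 'zenml up'.")
    else:
        print("Port 8237 is free.")
```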
By default, ZenML comes with a stack which runs locally. Next, we add MLflow as an experiment tracker to this local stack, which is where we'll run the pipelines:
zenml experiment-tracker register mlflow_tracker --flavor=mlflow
zenml stack register fuzzy_stack \
-a default \
-o default \
-e mlflow_tracker \
--set
You're now in a position where you can run the pipelines locally.
We have a couple of options for running the pipelines, specified by flags:
python run.py -dp # run the data pipeline only
python run.py -tp # run the training pipeline only
python run.py -dp -tp # run both the data and training pipelines
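For readers curious how flags like these can be wired up, here is a minimal sketch using `argparse` from the standard library. It is an assumption about the mechanism, not a copy of this repository's `run.py`, which may use a different CLI library. The `select_pipelines` helper and its return values are hypothetical.

```python
import argparse
from typing import List, Optional


def select_pipelines(argv: Optional[List[str]] = None) -> List[str]:
    """Parse the -dp / -tp flags and return the pipelines to run, in order."""
    parser = argparse.ArgumentParser(description="Run the Dobble pipelines.")
    parser.add_argument("-dp", dest="data_pipeline", action="store_true",
                        help="run the data pipeline")
    parser.add_argument("-tp", dest="training_pipeline", action="store_true",
                        help="run the training pipeline")
    args = parser.parse_args(argv)

    selected = []
    if args.data_pipeline:
        selected.append("data")
    if args.training_pipeline:
        selected.append("training")
    return selected


print(select_pipelines(["-dp", "-tp"]))  # ['data', 'training']
```

Each flag is an independent boolean switch, so passing both runs the data pipeline first and the training pipeline second in this sketch.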
Please see here for a detailed guide on what we've modified in the aws-minimal stack recipe and how to run it.
As part of our submission, we've written a series of blogs on our website. Each of the blogs has an accompanying video.
https://www.youtube.com/watch?v=j9TAVpM5NRQ
https://www.youtube.com/watch?v=djliB4QnuoQ
Video: https://www.youtube.com/watch?v=gCAzpyE0Zr8
Blog: https://www.fuzzylabs.ai/blog-post/zenmls-month-of-mlops-data-science-edition
Blog: https://www.fuzzylabs.ai/blog-post/mlops-pipeline-on-the-edge