diff --git a/integrations/model-training/detectron2/notebooks/Comet_with_Detectron2.ipynb b/integrations/model-training/detectron2/notebooks/Comet_with_Detectron2.ipynb
new file mode 100644
index 0000000..74826fa
--- /dev/null
+++ b/integrations/model-training/detectron2/notebooks/Comet_with_Detectron2.ipynb
@@ -0,0 +1,502 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "QHnVupBBn9eR"
+ },
+ "source": [
+ "# Comet with Detectron2\n",
+ "\n",
+ "
\n",
+ "
\n",
+ "\n",
+ "Detectron2 is Facebook AI Research's next generation library\n",
+ "that provides state-of-the-art detection and segmentation algorithms.\n",
+ "It is the successor of\n",
+ "[Detectron](https://github.com/facebookresearch/Detectron/)\n",
+ "and [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark/).\n",
+ "It supports a number of computer vision research projects and production applications in Facebook.\n",
+ "\n",
+ "\n",
+ "Comet integrates with Detectron 2, allowing you to log your training metrics and images.\n",
+ "\n",
+ "Get a preview for what's to come. Check out a completed experiment created from this notebook [here](https://www.comet.com/examples/comet-example-detectron2-notebook/cb1bb76296c046fc92f433fb6b81adb2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "vM54r6jlKTII"
+ },
+ "source": [
+ "# Install detectron2"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "FsePPpwZSmqt"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install 'git+https://github.com/facebookresearch/detectron2.git' torch torchvision \"comet_ml>=3.47.0\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "0d288Z2mF5dC"
+ },
+ "outputs": [],
+ "source": [
+ "import comet_ml\n",
+ "import torch, detectron2\n",
+ "\n",
+ "!nvcc --version\n",
+ "TORCH_VERSION = \".\".join(torch.__version__.split(\".\")[:2])\n",
+ "CUDA_VERSION = torch.__version__.split(\"+\")[-1]\n",
+ "print(\"torch: \", TORCH_VERSION, \"; cuda: \", CUDA_VERSION)\n",
+ "print(\"detectron2:\", detectron2.__version__)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ZyAvNCJMmvFF"
+ },
+ "outputs": [],
+ "source": [
+ "# Some basic setup:\n",
+ "# Setup detectron2 logger\n",
+ "import detectron2\n",
+ "from detectron2.utils.logger import setup_logger\n",
+ "\n",
+ "setup_logger()\n",
+ "\n",
+ "# import some common libraries\n",
+ "import numpy as np\n",
+ "import os, json, cv2, random\n",
+ "from google.colab.patches import cv2_imshow\n",
+ "\n",
+ "# import some common detectron2 utilities\n",
+ "from detectron2 import model_zoo\n",
+ "from detectron2.engine import DefaultPredictor\n",
+ "from detectron2.config import get_cfg\n",
+ "from detectron2.utils.visualizer import Visualizer\n",
+ "from detectron2.data import MetadataCatalog, DatasetCatalog"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Vk4gID50K03a"
+ },
+ "source": [
+ "# Run a pre-trained detectron2 model"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JgKyUL4pngvE"
+ },
+ "source": [
+ "We first download an image from the COCO dataset:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "dq9GY37ml1kr"
+ },
+ "outputs": [],
+ "source": [
+ "!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O input.jpg\n",
+ "im = cv2.imread(\"./input.jpg\")\n",
+ "cv2_imshow(im)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "uM1thbN-ntjI"
+ },
+ "source": [
+ "Then, we create a detectron2 config and a detectron2 `DefaultPredictor` to run inference on this image."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "HUjkwRsOn1O0"
+ },
+ "outputs": [],
+ "source": [
+ "cfg = get_cfg()\n",
+ "# add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library\n",
+ "cfg.merge_from_file(\n",
+ " model_zoo.get_config_file(\"COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml\")\n",
+ ")\n",
+ "cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5 # set threshold for this model|\n",
+ "# Find a model from detectron2's model zoo. You can use the https://dl.fbaipublicfiles... url as well\n",
+ "cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(\n",
+ " \"COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml\"\n",
+ ")\n",
+ "predictor = DefaultPredictor(cfg)\n",
+ "outputs = predictor(im)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "7d3KxiHO_0gb"
+ },
+ "outputs": [],
+ "source": [
+ "# look at the outputs. See https://detectron2.readthedocs.io/tutorials/models.html#model-output-format for specification\n",
+ "print(outputs[\"instances\"].pred_classes)\n",
+ "print(outputs[\"instances\"].pred_boxes)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "8IRGo8d0qkgR"
+ },
+ "outputs": [],
+ "source": [
+ "# We can use `Visualizer` to draw the predictions on the image.\n",
+ "v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)\n",
+ "out = v.draw_instance_predictions(outputs[\"instances\"].to(\"cpu\"))\n",
+ "cv2_imshow(out.get_image()[:, :, ::-1])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "b2bjrfb2LDeo"
+ },
+ "source": [
+ "# Train on a custom dataset"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tjbUIhSxUdm_"
+ },
+ "source": [
+ "In this section, we show how to train an existing detectron2 model on a custom dataset in a new format.\n",
+ "\n",
+ "We use [the balloon segmentation dataset](https://github.com/matterport/Mask_RCNN/tree/master/samples/balloon)\n",
+ "which only has one class: balloon.\n",
+ "We'll train a balloon segmentation model from an existing model pre-trained on COCO dataset, available in detectron2's model zoo.\n",
+ "\n",
+ "Note that COCO dataset does not have the \"balloon\" category. We'll be able to recognize this new class in a few minutes.\n",
+ "\n",
+ "## Prepare the dataset"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "4Qg7zSVOulkb"
+ },
+ "outputs": [],
+ "source": [
+ "# download, decompress the data\n",
+ "!wget https://github.com/matterport/Mask_RCNN/releases/download/v2.1/balloon_dataset.zip\n",
+ "!unzip balloon_dataset.zip > /dev/null"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tVJoOm6LVJwW"
+ },
+ "source": [
+ "Register the balloon dataset to detectron2, following the [detectron2 custom dataset tutorial](https://detectron2.readthedocs.io/tutorials/datasets.html).\n",
+ "Here, the dataset is in its custom format, therefore we write a function to parse it and prepare it into detectron2's standard format. User should write such a function when using a dataset in custom format. See the tutorial for more details.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "PIbAM2pv-urF"
+ },
+ "outputs": [],
+ "source": [
+ "# if your dataset is in COCO format, this cell can be replaced by the following three lines:\n",
+ "# from detectron2.data.datasets import register_coco_instances\n",
+ "# register_coco_instances(\"my_dataset_train\", {}, \"json_annotation_train.json\", \"path/to/image/dir\")\n",
+ "# register_coco_instances(\"my_dataset_val\", {}, \"json_annotation_val.json\", \"path/to/image/dir\")\n",
+ "\n",
+ "from detectron2.structures import BoxMode\n",
+ "\n",
+ "\n",
+ "def get_balloon_dicts(img_dir):\n",
+ " json_file = os.path.join(img_dir, \"via_region_data.json\")\n",
+ " with open(json_file) as f:\n",
+ " imgs_anns = json.load(f)\n",
+ "\n",
+ " dataset_dicts = []\n",
+ " for idx, v in enumerate(imgs_anns.values()):\n",
+ " record = {}\n",
+ "\n",
+ " filename = os.path.join(img_dir, v[\"filename\"])\n",
+ " height, width = cv2.imread(filename).shape[:2]\n",
+ "\n",
+ " record[\"file_name\"] = filename\n",
+ " record[\"image_id\"] = idx\n",
+ " record[\"height\"] = height\n",
+ " record[\"width\"] = width\n",
+ "\n",
+ " annos = v[\"regions\"]\n",
+ " objs = []\n",
+ " for _, anno in annos.items():\n",
+ " assert not anno[\"region_attributes\"]\n",
+ " anno = anno[\"shape_attributes\"]\n",
+ " px = anno[\"all_points_x\"]\n",
+ " py = anno[\"all_points_y\"]\n",
+ " poly = [(x + 0.5, y + 0.5) for x, y in zip(px, py)]\n",
+ " poly = [p for x in poly for p in x]\n",
+ "\n",
+ " obj = {\n",
+ " \"bbox\": [np.min(px), np.min(py), np.max(px), np.max(py)],\n",
+ " \"bbox_mode\": BoxMode.XYXY_ABS,\n",
+ " \"segmentation\": [poly],\n",
+ " \"category_id\": 0,\n",
+ " }\n",
+ " objs.append(obj)\n",
+ " record[\"annotations\"] = objs\n",
+ " dataset_dicts.append(record)\n",
+ " return dataset_dicts\n",
+ "\n",
+ "\n",
+ "for d in [\"train\", \"val\"]:\n",
+ " DatasetCatalog.register(\n",
+ " \"balloon_\" + d, lambda d=d: get_balloon_dicts(\"balloon/\" + d)\n",
+ " )\n",
+ " MetadataCatalog.get(\"balloon_\" + d).set(thing_classes=[\"balloon\"])\n",
+ "balloon_metadata = MetadataCatalog.get(\"balloon_train\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "6ljbWTX0Wi8E"
+ },
+ "source": [
+ "To verify the dataset is in correct format, let's visualize the annotations of randomly selected samples in the training set:\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "UkNbUzUOLYf0"
+ },
+ "outputs": [],
+ "source": [
+ "dataset_dicts = get_balloon_dicts(\"balloon/train\")\n",
+ "for d in random.sample(dataset_dicts, 3):\n",
+ " img = cv2.imread(d[\"file_name\"])\n",
+ " visualizer = Visualizer(img[:, :, ::-1], metadata=balloon_metadata, scale=0.5)\n",
+ " out = visualizer.draw_dataset_dict(d)\n",
+ " cv2_imshow(out.get_image()[:, :, ::-1])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "wlqXIXXhW8dA"
+ },
+ "source": [
+ "## Train!\n",
+ "\n",
+ "Now, let's fine-tune a COCO-pretrained R50-FPN Mask R-CNN model on the balloon dataset. It takes ~2 minutes to train 300 iterations on a P100 GPU.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "7unkuuiqLdqd"
+ },
+ "outputs": [],
+ "source": [
+ "import comet_ml\n",
+ "from detectron2.engine import DefaultTrainer\n",
+ "\n",
+ "comet_ml.login()\n",
+ "\n",
+ "experiment = comet_ml.start(project_name=\"comet-example-detectron2-notebook\")\n",
+ "\n",
+ "cfg = get_cfg()\n",
+ "cfg.merge_from_file(\n",
+ " model_zoo.get_config_file(\"COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml\")\n",
+ ")\n",
+ "cfg.DATASETS.TRAIN = (\"balloon_train\",)\n",
+ "cfg.DATASETS.TEST = ()\n",
+ "cfg.DATALOADER.NUM_WORKERS = 2\n",
+ "cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(\n",
+ " \"COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml\"\n",
+ ") # Let training initialize from model zoo\n",
+ "cfg.SOLVER.IMS_PER_BATCH = (\n",
+ " 2 # This is the real \"batch size\" commonly known to deep learning people\n",
+ ")\n",
+ "cfg.SOLVER.BASE_LR = 0.00025 # pick a good LR\n",
+ "cfg.SOLVER.MAX_ITER = 300 # 300 iterations seems good enough for this toy dataset; you will need to train longer for a practical dataset\n",
+ "cfg.SOLVER.STEPS = [] # do not decay learning rate\n",
+ "cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128 # The \"RoIHead batch size\". 128 is faster, and good enough for this toy dataset (default: 512)\n",
+ "cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1 # only has one class (ballon). (see https://detectron2.readthedocs.io/tutorials/datasets.html#update-the-config-for-new-datasets)\n",
+ "# NOTE: this config means the number of classes, but a few popular unofficial tutorials incorrect uses num_classes+1 here.\n",
+ "\n",
+ "os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)\n",
+ "trainer = DefaultTrainer(cfg)\n",
+ "trainer.resume_or_load(resume=False)\n",
+ "trainer.train()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0e4vdDIOXyxF"
+ },
+ "source": [
+ "## Inference & evaluation using the trained model\n",
+ "Now, let's run inference with the trained model on the balloon validation dataset. First, let's create a predictor using the model we just trained:\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Ya5nEuMELeq8"
+ },
+ "outputs": [],
+ "source": [
+ "# Inference should use the config with parameters that are used in training\n",
+ "# cfg now already contains everything we've set previously. We changed it a little bit for inference:\n",
+ "cfg.MODEL.WEIGHTS = os.path.join(\n",
+ " cfg.OUTPUT_DIR, \"model_final.pth\"\n",
+ ") # path to the model we just trained\n",
+ "cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7 # set a custom testing threshold\n",
+ "predictor = DefaultPredictor(cfg)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "qWq1XHfDWiXO"
+ },
+ "source": [
+ "Then, we randomly select several samples to visualize the prediction results."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "U5LhISJqWXgM"
+ },
+ "outputs": [],
+ "source": [
+ "from detectron2.utils.visualizer import ColorMode\n",
+ "\n",
+ "dataset_dicts = get_balloon_dicts(\"balloon/val\")\n",
+ "for d in random.sample(dataset_dicts, 3):\n",
+ " im = cv2.imread(d[\"file_name\"])\n",
+ " outputs = predictor(\n",
+ " im\n",
+ " ) # format is documented at https://detectron2.readthedocs.io/tutorials/models.html#model-output-format\n",
+ " v = Visualizer(\n",
+ " im[:, :, ::-1],\n",
+ " metadata=balloon_metadata,\n",
+ " scale=0.5,\n",
+ " instance_mode=ColorMode.IMAGE_BW, # remove the colors of unsegmented pixels. This option is only available for segmentation models\n",
+ " )\n",
+ " out = v.draw_instance_predictions(outputs[\"instances\"].to(\"cpu\"))\n",
+ " cv2_imshow(out.get_image()[:, :, ::-1])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "kblA1IyFvWbT"
+ },
+ "source": [
+ "We can also evaluate its performance using AP metric implemented in COCO API.\n",
+ "This gives an AP of ~70. Not bad!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "h9tECBQCvMv3"
+ },
+ "outputs": [],
+ "source": [
+ "from detectron2.evaluation import COCOEvaluator, inference_on_dataset\n",
+ "from detectron2.data import build_detection_test_loader\n",
+ "\n",
+ "evaluator = COCOEvaluator(\"balloon_val\", output_dir=\"./output\")\n",
+ "val_loader = build_detection_test_loader(cfg, \"balloon_val\")\n",
+ "print(inference_on_dataset(predictor.model, val_loader, evaluator))\n",
+ "# another equivalent way to evaluate the model is to use `trainer.test`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XgisdpqDYiFv"
+ },
+ "outputs": [],
+ "source": [
+ "# Make sure that everything is logged to comet\n",
+ "comet_ml.end()"
+ ]
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "gpuType": "T4",
+ "provenance": [],
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.9.1"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}