diff --git a/integrations/model-training/detectron2/notebooks/Comet_with_Detectron2.ipynb b/integrations/model-training/detectron2/notebooks/Comet_with_Detectron2.ipynb
new file mode 100644
index 0000000..74826fa
--- /dev/null
+++ b/integrations/model-training/detectron2/notebooks/Comet_with_Detectron2.ipynb
@@ -0,0 +1,502 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "QHnVupBBn9eR"
+   },
+   "source": [
+    "# Comet with Detectron2\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "Detectron2 is Facebook AI Research's next-generation library\n",
+    "that provides state-of-the-art detection and segmentation algorithms.\n",
+    "It is the successor of\n",
+    "[Detectron](https://github.com/facebookresearch/Detectron/)\n",
+    "and [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark/).\n",
+    "It supports a number of computer vision research projects and production applications at Facebook.\n",
+    "\n",
+    "\n",
+    "Comet integrates with Detectron2, allowing you to log your training metrics and images.\n",
+    "\n",
+    "Get a preview of what's to come. 
Check out a completed experiment created from this notebook [here](https://www.comet.com/examples/comet-example-detectron2-notebook/cb1bb76296c046fc92f433fb6b81adb2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "vM54r6jlKTII" + }, + "source": [ + "# Install detectron2" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "FsePPpwZSmqt" + }, + "outputs": [], + "source": [ + "%pip install 'git+https://github.com/facebookresearch/detectron2.git' torch torchvision \"comet_ml>=3.47.0\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "0d288Z2mF5dC" + }, + "outputs": [], + "source": [ + "import comet_ml\n", + "import torch, detectron2\n", + "\n", + "!nvcc --version\n", + "TORCH_VERSION = \".\".join(torch.__version__.split(\".\")[:2])\n", + "CUDA_VERSION = torch.__version__.split(\"+\")[-1]\n", + "print(\"torch: \", TORCH_VERSION, \"; cuda: \", CUDA_VERSION)\n", + "print(\"detectron2:\", detectron2.__version__)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ZyAvNCJMmvFF" + }, + "outputs": [], + "source": [ + "# Some basic setup:\n", + "# Setup detectron2 logger\n", + "import detectron2\n", + "from detectron2.utils.logger import setup_logger\n", + "\n", + "setup_logger()\n", + "\n", + "# import some common libraries\n", + "import numpy as np\n", + "import os, json, cv2, random\n", + "from google.colab.patches import cv2_imshow\n", + "\n", + "# import some common detectron2 utilities\n", + "from detectron2 import model_zoo\n", + "from detectron2.engine import DefaultPredictor\n", + "from detectron2.config import get_cfg\n", + "from detectron2.utils.visualizer import Visualizer\n", + "from detectron2.data import MetadataCatalog, DatasetCatalog" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Vk4gID50K03a" + }, + "source": [ + "# Run a pre-trained detectron2 model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + 
"id": "JgKyUL4pngvE"
+   },
+   "source": [
+    "We first download an image from the COCO dataset:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "dq9GY37ml1kr"
+   },
+   "outputs": [],
+   "source": [
+    "!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O input.jpg\n",
+    "im = cv2.imread(\"./input.jpg\")\n",
+    "cv2_imshow(im)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "uM1thbN-ntjI"
+   },
+   "source": [
+    "Then, we create a detectron2 config and a detectron2 `DefaultPredictor` to run inference on this image."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "HUjkwRsOn1O0"
+   },
+   "outputs": [],
+   "source": [
+    "cfg = get_cfg()\n",
+    "# add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library\n",
+    "cfg.merge_from_file(\n",
+    "    model_zoo.get_config_file(\"COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml\")\n",
+    ")\n",
+    "cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model\n",
+    "# Find a model from detectron2's model zoo. You can use the https://dl.fbaipublicfiles... URL as well\n",
+    "cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(\n",
+    "    \"COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml\"\n",
+    ")\n",
+    "predictor = DefaultPredictor(cfg)\n",
+    "outputs = predictor(im)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "7d3KxiHO_0gb"
+   },
+   "outputs": [],
+   "source": [
+    "# Look at the outputs. 
See https://detectron2.readthedocs.io/tutorials/models.html#model-output-format for the specification\n",
+    "print(outputs[\"instances\"].pred_classes)\n",
+    "print(outputs[\"instances\"].pred_boxes)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "8IRGo8d0qkgR"
+   },
+   "outputs": [],
+   "source": [
+    "# We can use `Visualizer` to draw the predictions on the image.\n",
+    "v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)\n",
+    "out = v.draw_instance_predictions(outputs[\"instances\"].to(\"cpu\"))\n",
+    "cv2_imshow(out.get_image()[:, :, ::-1])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "b2bjrfb2LDeo"
+   },
+   "source": [
+    "# Train on a custom dataset"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "tjbUIhSxUdm_"
+   },
+   "source": [
+    "In this section, we show how to train an existing detectron2 model on a custom dataset in a new format.\n",
+    "\n",
+    "We use [the balloon segmentation dataset](https://github.com/matterport/Mask_RCNN/tree/master/samples/balloon)\n",
+    "which only has one class: balloon.\n",
+    "We'll train a balloon segmentation model from an existing model pre-trained on the COCO dataset, available in detectron2's model zoo.\n",
+    "\n",
+    "Note that the COCO dataset does not have the \"balloon\" category. 
We'll be able to recognize this new class in a few minutes.\n",
+    "\n",
+    "## Prepare the dataset"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "4Qg7zSVOulkb"
+   },
+   "outputs": [],
+   "source": [
+    "# Download and decompress the data\n",
+    "!wget https://github.com/matterport/Mask_RCNN/releases/download/v2.1/balloon_dataset.zip\n",
+    "!unzip balloon_dataset.zip > /dev/null"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "tVJoOm6LVJwW"
+   },
+   "source": [
+    "Register the balloon dataset with detectron2, following the [detectron2 custom dataset tutorial](https://detectron2.readthedocs.io/tutorials/datasets.html).\n",
+    "Here, the dataset is in its own custom format; therefore, we write a function to parse it and prepare it into detectron2's standard format. Users should write such a function when using a dataset in a custom format. See the tutorial for more details.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "PIbAM2pv-urF"
+   },
+   "outputs": [],
+   "source": [
+    "# if your dataset is in COCO format, this cell can be replaced by the following three lines:\n",
+    "# from detectron2.data.datasets import register_coco_instances\n",
+    "# register_coco_instances(\"my_dataset_train\", {}, \"json_annotation_train.json\", \"path/to/image/dir\")\n",
+    "# register_coco_instances(\"my_dataset_val\", {}, \"json_annotation_val.json\", \"path/to/image/dir\")\n",
+    "\n",
+    "from detectron2.structures import BoxMode\n",
+    "\n",
+    "\n",
+    "def get_balloon_dicts(img_dir):\n",
+    "    json_file = os.path.join(img_dir, \"via_region_data.json\")\n",
+    "    with open(json_file) as f:\n",
+    "        imgs_anns = json.load(f)\n",
+    "\n",
+    "    dataset_dicts = []\n",
+    "    for idx, v in enumerate(imgs_anns.values()):\n",
+    "        record = {}\n",
+    "\n",
+    "        filename = os.path.join(img_dir, v[\"filename\"])\n",
+    "        height, width = cv2.imread(filename).shape[:2]\n",
+    "\n",
+    "        record[\"file_name\"] = filename\n",
+    "        
record[\"image_id\"] = idx\n",
+    "        record[\"height\"] = height\n",
+    "        record[\"width\"] = width\n",
+    "\n",
+    "        annos = v[\"regions\"]\n",
+    "        objs = []\n",
+    "        for _, anno in annos.items():\n",
+    "            assert not anno[\"region_attributes\"]\n",
+    "            anno = anno[\"shape_attributes\"]\n",
+    "            px = anno[\"all_points_x\"]\n",
+    "            py = anno[\"all_points_y\"]\n",
+    "            poly = [(x + 0.5, y + 0.5) for x, y in zip(px, py)]\n",
+    "            poly = [p for x in poly for p in x]\n",
+    "\n",
+    "            obj = {\n",
+    "                \"bbox\": [np.min(px), np.min(py), np.max(px), np.max(py)],\n",
+    "                \"bbox_mode\": BoxMode.XYXY_ABS,\n",
+    "                \"segmentation\": [poly],\n",
+    "                \"category_id\": 0,\n",
+    "            }\n",
+    "            objs.append(obj)\n",
+    "        record[\"annotations\"] = objs\n",
+    "        dataset_dicts.append(record)\n",
+    "    return dataset_dicts\n",
+    "\n",
+    "\n",
+    "for d in [\"train\", \"val\"]:\n",
+    "    DatasetCatalog.register(\n",
+    "        \"balloon_\" + d, lambda d=d: get_balloon_dicts(\"balloon/\" + d)\n",
+    "    )\n",
+    "    MetadataCatalog.get(\"balloon_\" + d).set(thing_classes=[\"balloon\"])\n",
+    "balloon_metadata = MetadataCatalog.get(\"balloon_train\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "6ljbWTX0Wi8E"
+   },
+   "source": [
+    "To verify that the dataset is in the correct format, let's visualize the annotations of randomly selected samples in the training set:\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "UkNbUzUOLYf0"
+   },
+   "outputs": [],
+   "source": [
+    "dataset_dicts = get_balloon_dicts(\"balloon/train\")\n",
+    "for d in random.sample(dataset_dicts, 3):\n",
+    "    img = cv2.imread(d[\"file_name\"])\n",
+    "    visualizer = Visualizer(img[:, :, ::-1], metadata=balloon_metadata, scale=0.5)\n",
+    "    out = visualizer.draw_dataset_dict(d)\n",
+    "    cv2_imshow(out.get_image()[:, :, ::-1])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "wlqXIXXhW8dA"
+   },
+   "source": [
+    "## Train!\n",
+    "\n",
+    "Now, let's fine-tune a COCO-pretrained R50-FPN Mask R-CNN 
model on the balloon dataset. It takes ~2 minutes to train 300 iterations on a P100 GPU.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "7unkuuiqLdqd"
+   },
+   "outputs": [],
+   "source": [
+    "import comet_ml\n",
+    "from detectron2.engine import DefaultTrainer\n",
+    "\n",
+    "comet_ml.login()\n",
+    "\n",
+    "experiment = comet_ml.start(project_name=\"comet-example-detectron2-notebook\")\n",
+    "\n",
+    "cfg = get_cfg()\n",
+    "cfg.merge_from_file(\n",
+    "    model_zoo.get_config_file(\"COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml\")\n",
+    ")\n",
+    "cfg.DATASETS.TRAIN = (\"balloon_train\",)\n",
+    "cfg.DATASETS.TEST = ()\n",
+    "cfg.DATALOADER.NUM_WORKERS = 2\n",
+    "cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(\n",
+    "    \"COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml\"\n",
+    ")  # Let training initialize from model zoo\n",
+    "cfg.SOLVER.IMS_PER_BATCH = (\n",
+    "    2  # This is the real \"batch size\" commonly known to deep learning people\n",
+    ")\n",
+    "cfg.SOLVER.BASE_LR = 0.00025  # pick a good LR\n",
+    "cfg.SOLVER.MAX_ITER = 300  # 300 iterations seems good enough for this toy dataset; you will need to train longer for a practical dataset\n",
+    "cfg.SOLVER.STEPS = []  # do not decay learning rate\n",
+    "cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128  # The \"RoIHead batch size\". 128 is faster, and good enough for this toy dataset (default: 512)\n",
+    "cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # only has one class (balloon). 
(see https://detectron2.readthedocs.io/tutorials/datasets.html#update-the-config-for-new-datasets)\n",
+    "# NOTE: this config sets the number of classes, but a few popular unofficial tutorials incorrectly use num_classes+1 here.\n",
+    "\n",
+    "os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)\n",
+    "trainer = DefaultTrainer(cfg)\n",
+    "trainer.resume_or_load(resume=False)\n",
+    "trainer.train()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "0e4vdDIOXyxF"
+   },
+   "source": [
+    "## Inference & evaluation using the trained model\n",
+    "Now, let's run inference with the trained model on the balloon validation dataset. First, let's create a predictor using the model we just trained:\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "Ya5nEuMELeq8"
+   },
+   "outputs": [],
+   "source": [
+    "# Inference should use the config with the parameters that were used in training\n",
+    "# cfg now already contains everything we've set previously. We change it a little bit for inference:\n",
+    "cfg.MODEL.WEIGHTS = os.path.join(\n",
+    "    cfg.OUTPUT_DIR, \"model_final.pth\"\n",
+    ")  # path to the model we just trained\n",
+    "cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set a custom testing threshold\n",
+    "predictor = DefaultPredictor(cfg)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "qWq1XHfDWiXO"
+   },
+   "source": [
+    "Then, we randomly select several samples to visualize the prediction results."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "U5LhISJqWXgM"
+   },
+   "outputs": [],
+   "source": [
+    "from detectron2.utils.visualizer import ColorMode\n",
+    "\n",
+    "dataset_dicts = get_balloon_dicts(\"balloon/val\")\n",
+    "for d in random.sample(dataset_dicts, 3):\n",
+    "    im = cv2.imread(d[\"file_name\"])\n",
+    "    outputs = predictor(\n",
+    "        im\n",
+    "    )  # format is documented at https://detectron2.readthedocs.io/tutorials/models.html#model-output-format\n",
+    "    v = Visualizer(\n",
+    "        im[:, :, ::-1],\n",
+    "        metadata=balloon_metadata,\n",
+    "        scale=0.5,\n",
+    "        instance_mode=ColorMode.IMAGE_BW,  # remove the colors of unsegmented pixels. This option is only available for segmentation models\n",
+    "    )\n",
+    "    out = v.draw_instance_predictions(outputs[\"instances\"].to(\"cpu\"))\n",
+    "    cv2_imshow(out.get_image()[:, :, ::-1])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "kblA1IyFvWbT"
+   },
+   "source": [
+    "We can also evaluate its performance using the AP metric implemented in the COCO API.\n",
+    "This gives an AP of ~70. Not bad!"
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "h9tECBQCvMv3" + }, + "outputs": [], + "source": [ + "from detectron2.evaluation import COCOEvaluator, inference_on_dataset\n", + "from detectron2.data import build_detection_test_loader\n", + "\n", + "evaluator = COCOEvaluator(\"balloon_val\", output_dir=\"./output\")\n", + "val_loader = build_detection_test_loader(cfg, \"balloon_val\")\n", + "print(inference_on_dataset(predictor.model, val_loader, evaluator))\n", + "# another equivalent way to evaluate the model is to use `trainer.test`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "XgisdpqDYiFv" + }, + "outputs": [], + "source": [ + "# Make sure that everything is logged to comet\n", + "comet_ml.end()" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "gpuType": "T4", + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.1" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}