Real-time bounding box regression based on ResNet18 using LibTorch. OpenCV and LibTorch are required dependencies. TorchVision is optional: if it is not available, the ResNet model in the model/ folder, which is taken from the TorchVision repository, is used instead. The marker detector is robust to poor lighting conditions, as can be seen in the following figure:
An example of the target is shown below.
The images/ folder contains PNG and SVG versions of the marker.
- Clone this repository and create a build directory:
git clone https://github.com/jhacsonmeza/CNN-MarkerDetect.git
cd CNN-MarkerDetect
mkdir build && cd build
- If you have TorchVision available, run:
cmake -DCMAKE_PREFIX_PATH="path/to/LibTorch;path/to/TorchVision" ..
- Otherwise, run:
cmake -DCMAKE_PREFIX_PATH=path/to/LibTorch ..
- Then build the project:
cmake --build .
For training, download the ResNet18 pretrained weights here into the CNN-MarkerDetect folder. Then, from the CNN-MarkerDetect/build/ folder, run ./train. During training, the following data augmentation operations are applied randomly: vertical and horizontal flips, translation, scaling, and brightness changes. Intersection over union (IoU) is used as the accuracy metric during training.
Download the weights of the model here into the cloned CNN-MarkerDetect folder. For a quick test, also download the target image from the images/ folder to your mobile device. Then, from the CNN-MarkerDetect/build/ folder, run ./realtime. This program uses your computer's webcam for target detection. Press ESC to stop the video acquisition. The example above was run on the CPU.