A semantic segmentation project for UT Digital Video Processing, using @comma.ai's comma10k dataset.
We updated the original comma10k-baseline to PyTorch Lightning 1.0 and used Trans2Seg, implemented in a fork of the pytorch segmentation models repo. This transformer model with a ResNet backbone reaches a validation categorical cross-entropy (CCE) loss of 0.073.
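For readers unfamiliar with the metric, here is a minimal sketch of a per-pixel categorical cross-entropy in plain Python. It assumes natural logarithms and a mean over pixels. The function names `softmax` and `categorical_cross_entropy` are illustrative only and are not taken from this repo, whose training code may compute the loss differently.

```python
import math


def softmax(logits):
    """Convert raw class scores for one pixel into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(pixel_logits, pixel_labels):
    """Mean of -log p(true class) over all pixels (hypothetical helper)."""
    losses = []
    for logits, label in zip(pixel_logits, pixel_labels):
        probs = softmax(logits)
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Two toy "pixels" with three classes each; the labels are the true class indices.
    logits = [[4.0, 0.5, 0.1], [0.2, 3.5, 0.3]]
    labels = [0, 1]
    print(categorical_cross_entropy(logits, labels))
```

Under these assumptions, a mean loss of 0.073 corresponds to a geometric-mean probability of about exp(-0.073) ≈ 0.93 assigned to the correct class.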
- clone [my fork] of pytorch_segmentation.models
- navigate to the fork and install it in editable mode
pip install -e .
- clone this repo
- navigate to this repo and install required packages
make create_environment
conda activate drive_segmentation
make requirements
Training runs in a single stage on 512x512 images.
python3 train_lit_model.py --backbone efficientnet-b4 --version second-stage --gpus 2 --batch-size 7 --learning-rate 5e-5 --epochs 30 --height 512 --width 512 --augmentation-level hard
Requires Python 3.5+, PyTorch 1.6+, and the dependencies listed in requirements.txt.
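To check an environment against these minimums before training, something like the following sketch can help. It is not part of this repo; `parse_version` and `check_requirements` are hypothetical helper names, and only the Python and PyTorch versions are checked, not the packages in requirements.txt.

```python
import sys
from importlib import import_module


def parse_version(text):
    """Turn a string such as '1.6.0+cu101' into (1, 6) for a major/minor comparison."""
    parts = []
    for piece in text.split("+")[0].split(".")[:2]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def check_requirements(min_python=(3, 5), min_torch=(1, 6)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python {}.{}+ required".format(*min_python))
    try:
        torch = import_module("torch")
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if parse_version(torch.__version__) < min_torch:
            problems.append("PyTorch {}.{}+ required, found {}".format(
                min_torch[0], min_torch[1], torch.__version__))
    return problems


if __name__ == "__main__":
    for line in check_requirements() or ["environment looks OK"]:
        print(line)
```

Run it inside the activated conda environment; any line other than "environment looks OK" names a requirement that is not met.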