HASEL (High-Speed Video Annotation Tool for Structured Light Endoscopy in the Human Larynx) is a deep-learning-supported tool for generating ground-truth data for High-Speed Video Structured Light Laryngoscopy. It enables the robust and rapid generation of:
- Glottal segmentation with different segmentation architectures
- Vocal fold segmentation via frame-wise interpolation
- Semi-automatic and deep-learning-enhanced generation of laserpoint data.
Please follow these instructions to make sure that HASEL runs as intended. In general, we recommend a current NVIDIA graphics card on par with a Quadro RTX 4000. First, create the environment and install the necessary packages.
conda create --name VFLabel python=3.12
pip install torch torchvision torchaudio
conda install pyqt qtpy
pip install torchmetrics albumentations imageio kornia segmentation-models-pytorch matplotlib flow_vis tensorboard tqdm
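To verify that PyTorch sees your GPU and that the Qt bindings are available, you can run a quick sanity check (a minimal sketch; the reported CUDA availability depends on your driver and install):

```python
# Optional sanity check for the freshly created environment.
import torch
from qtpy import QtWidgets  # provided by the pyqt/qtpy packages installed above

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```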
Next, install HASEL for development:
git clone https://github.com/Henningson/VFLabel.git
cd VFLabel
python3 -m pip install -e .
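If the editable install succeeded, the package should be importable from the environment (assuming it is exposed under the name VFLabel, matching the repository name):

```python
# Verify the editable install; the import name is assumed to match the repository.
import VFLabel
print("VFLabel imported successfully")
```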
Next, we need to create a model folder and install Meta's CoTracker3. For this, you can also follow these instructions.
cd ..
git clone https://github.com/facebookresearch/co-tracker
cd co-tracker
pip install -e .
Next, we need to download the CoTracker3 offline checkpoint.
cd ../VFLabel
mkdir assets/models
cd assets/models
wget https://huggingface.co/facebook/cotracker3/resolve/main/scaled_offline.pth
cd ../..
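As a rough sketch of how the downloaded offline checkpoint can be used (following the co-tracker README; `CoTrackerPredictor`, its `checkpoint` argument, and the expected tensor layout are taken from that repository and may change between versions):

```python
# Minimal sketch: load the offline CoTracker3 checkpoint downloaded above.
import torch
from cotracker.predictor import CoTrackerPredictor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = CoTrackerPredictor(checkpoint="assets/models/scaled_offline.pth").to(device)

# video: float tensor of shape (batch, frames, channels, height, width) in [0, 255]
video = torch.zeros(1, 10, 3, 256, 256, device=device)
pred_tracks, pred_visibility = model(video, grid_size=10)
print(pred_tracks.shape, pred_visibility.shape)
```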
Finally, download the glottis segmentation networks from here and move them to assets/models/.
Glottal segmentations can also easily be generated from the command line via the supplied script in the examples:
python examples/scripts/segment_glottis.py --encoder mobilenet_v2 --image_folder PATH_TO_WHERE_THE_IMAGES_ARE --save_folder OUTPUT_FOLDER
We supply four(*) U-Nets with different backbones in this repository.
They can be downloaded here. Make sure to extract the files into assets/models.
The evaluation of the models is shown below.
The ResNet-based backbones generally perform best, while the lighter backbones are better suited for CPU-only systems.
You should test which one works best for your data.
You can find examples of how to use the supplied networks in examples/scripts.
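As an illustration only (the exact checkpoint format and preprocessing are defined by the scripts in examples/scripts, so treat the file name and loading step below as assumptions): the backbones listed in the table correspond to segmentation-models-pytorch U-Nets, which can be used for inference roughly like this:

```python
# Rough inference sketch with segmentation-models-pytorch.
# Checkpoint path and format (plain state_dict) are assumed; see examples/scripts.
import torch
import segmentation_models_pytorch as smp

device = "cuda" if torch.cuda.is_available() else "cpu"
model = smp.Unet(encoder_name="mobilenet_v2", encoder_weights=None, in_channels=3, classes=1)
model.load_state_dict(torch.load("assets/models/mobilenet_v2.pth", map_location=device))
model.to(device).eval()

# image: float tensor of shape (batch, channels, height, width), normalized as in training
image = torch.zeros(1, 3, 512, 256, device=device)
with torch.no_grad():
    glottis_mask = torch.sigmoid(model(image)) > 0.5
```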
*: There is also an EfficientNet backbone available, but it generally performs worse than the rest. However, I'd advise you to also test it on some of your data.
We evaluated the networks on a combined test set of the BAGLS and HLE datasets, as well as synthetically created vocal folds generated with Fireflies.
Backbone | Eval IoU | Eval Dice | Test Dice | Test IoU |
---|---|---|---|---|
mobilenet-v2 | 0.864 | 0.927 | 0.893 | 0.807 |
mobilenetv3_large_100 | 0.845 | 0.916 | 0.789 | 0.650 |
resnet18 | 0.856 | 0.922 | 0.882 | 0.789 |
resnet34 | 0.846 | 0.917 | 0.883 | 0.791 |
To train your own network on a set of vocal fold datasets, download the HLE and BAGLS datasets and put them into a common folder. Next, download the Fireflies dataset from here and extract it into the same folder. The final folder structure should look like this:
dataset/
├── BAGLS/
├── HLEDataset/
└── fireflies_dataset_v5/
For training, please follow the code in the example script examples/scripts/train_glottis_segmentation_network.py. There, you will fine-tune the decoder of common segmentation model architectures that were pretrained on ImageNet.
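A minimal sketch of the decoder fine-tuning idea (the actual training loop, data loading, and hyperparameters live in examples/scripts/train_glottis_segmentation_network.py; the names and values below are illustrative):

```python
# Sketch: freeze an ImageNet-pretrained encoder and fine-tune only decoder and head.
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(encoder_name="resnet18", encoder_weights="imagenet", in_channels=3, classes=1)

# Freeze the encoder so only the decoder and segmentation head receive gradients.
for param in model.encoder.parameters():
    param.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
loss_fn = smp.losses.DiceLoss(mode="binary")

# Inside the training loop (images and masks come from the dataset folder above):
#   logits = model(images)
#   loss = loss_fn(logits, masks)
#   loss.backward(); optimizer.step(); optimizer.zero_grad()
```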