Replies: 2 comments
- Hi @Brainkite
- Interesting.

---
I'm quite new to MMSegmentation, and it's a great library.
I ran some WandB hyper-parameter optimization with various models on the demo Colab notebook and, surprisingly, reached very similar best validation Dice scores for FCN_R101, OCR_HRNet_W48, and DeepLabV3+.
I mostly explored optimizers (SGD, Adam), schedules (poly, 1-cycle), learning rate, momentum, and weight decay, and always started from the Cityscapes 80K pre-trained models.
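As a rough sketch, the search space looked something like the following with the W&B sweep API (the project name, metric key, and ranges below are illustrative placeholders, not the exact sweep used):

```python
import wandb

# Illustrative sweep definition; names and ranges are examples, not the exact values used.
sweep_config = {
    "method": "bayes",  # Bayesian search over the space below
    "metric": {"name": "val/mDice", "goal": "maximize"},  # assumed metric key logged during training
    "parameters": {
        "optimizer": {"values": ["sgd", "adam"]},
        "schedule": {"values": ["poly", "1cycle"]},
        "lr": {"distribution": "log_uniform_values", "min": 1e-5, "max": 1e-1},
        "momentum": {"distribution": "uniform", "min": 0.7, "max": 0.99},
        "weight_decay": {"distribution": "log_uniform_values", "min": 1e-6, "max": 1e-3},
    },
}

sweep_id = wandb.sweep(sweep_config, project="mmseg-hparam-search")
# wandb.agent(sweep_id, function=train_fn, count=50)  # train_fn builds the MMSeg config and runs training
```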
Here are the best validation Dice scores each model reached, and the parameters that produced them:
| Model | Best val Dice | Batch size | Max iters | Optimizer | Schedule | lr | Momentum | Weight decay |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| OCR_HRNet_W48 | 0.8683 | 16 | 2000 | sgd | poly | 0.02765 | 0.8 | 0.00001 |
| FCN_R101b | 0.8631 | 16 | 2000 | sgd | poly | 0.01658 | 0.8113 | 0.00001 |
| DeepLabV3+_R101 | 0.8616 | 16 | 2000 | adam | cyclic | 0.0001 | n/a | 0.000001 |
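For context, here is a rough sketch of how the best OCR_HRNet_W48 settings above would map onto an MMSegmentation 0.x-style config (as used by the demo notebook); the base config path and checkpoint path are placeholders rather than the exact files used:

```python
# Sketch of the best OCR_HRNet_W48 run expressed as an MMSegmentation 0.x config override.
# The _base_ and load_from paths are placeholders.
_base_ = './configs/ocrnet/ocrnet_hr48_512x1024_80k_cityscapes.py'

optimizer = dict(type='SGD', lr=0.02765, momentum=0.8, weight_decay=0.00001)
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=2000)
data = dict(samples_per_gpu=16)  # batch size 16, assuming a single GPU
load_from = 'checkpoints/ocrnet_hr48_512x1024_80k_cityscapes.pth'  # Cityscapes 80K pre-trained weights
```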
Could this be because of the "small" size of the Stanford dataset, or because of the "short" training of 2000 iterations?
Or is it because I started from models that were already heavily fine-tuned on Cityscapes, and I would have obtained quite different scores by training from pre-trained backbones with randomly initialized heads?