Dropout causes significant performance change between each training #24
Comments
This is a very good point! It uses three samples for each hyper-parameter set in order to average the final performance. One idea for combating overfitting without causing variation in the final performance is to use Batch Normalization instead of Dropout. I should try it and see whether it works better. Do you have any ideas on that?
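If it helps, here is a minimal PyTorch sketch of that swap. The model layout and the `use_batchnorm` flag are hypothetical illustrations, not this repo's actual child_model:

```python
import torch.nn as nn

def make_child_model(use_batchnorm: bool = True) -> nn.Sequential:
    """Toy stand-in for a child model; layer sizes are arbitrary."""
    # BatchNorm2d is deterministic at eval time; Dropout2d injects
    # run-to-run randomness through its sampled masks.
    regularizer = nn.BatchNorm2d(32) if use_batchnorm else nn.Dropout2d(p=0.5)
    return nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, padding=1),
        nn.ReLU(),
        regularizer,
        nn.Conv2d(32, 64, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(64, 10),
    )
```

The idea is that BatchNorm still regularizes (through batch statistics during training) but evaluates deterministically, so it should not add variance between runs the way Dropout masks do.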
I am using both batch norm and dropout on my custom dataset. Training images = 226; the model trains with 1356 images and validates on 40 images. However, it produces a 0.1 validation score on every epoch. Is this normal? This is my model:
For my part, I would use SGD instead of an adaptive optimizer, because SGD tends to converge to flat minima, which generalize better. SGD is much slower than adaptive optimizers, so I would also change the learning-rate schedule to a cyclical cosine schedule, which gives a steadier outcome. After all, we only need relatively better hyper-parameters, not the "best" ones.
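A rough PyTorch sketch of that setup, assuming the hypothetical `make_child_model` from above and a hypothetical `train_one_epoch` helper; the hyper-parameter values are placeholders:

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = make_child_model()  # hypothetical child model from the sketch above
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
# Cyclical cosine schedule: restart after 10 epochs, doubling the period each cycle.
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)

for epoch in range(70):
    train_one_epoch(model, optimizer)  # hypothetical: forward/backward/step per batch
    scheduler.step()                   # advance the cosine schedule once per epoch
```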
Using Dropout in child_model works great for preventing overfitting; however, it also causes the model's final performance to change significantly between trainings with the same hyper-parameters. It is so random that we need more sampling runs to estimate the final performance of one hyper-parameter set, which is very time-consuming. Any ideas for solving this problem?
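For reference, a minimal sketch of the multi-run averaging this thread describes, plus explicit seeding so each individual run is at least reproducible. `train_and_evaluate` is a hypothetical stand-in for the project's training loop, not an actual function in this repo:

```python
import random
import numpy as np
import torch

def evaluate_hyperparams(hp, n_samples=3):
    """Average several seeded runs so Dropout's randomness is smoothed out."""
    scores = []
    for seed in range(n_samples):
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)  # fixes weight init and Dropout masks for this run
        scores.append(train_and_evaluate(hp))  # hypothetical: trains a child model, returns val score
    return float(np.mean(scores))
```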