Added definition for a 10 hidden layer ANN for Ablation tests #30
base: main
Conversation
Perhaps one silly question... but what is ablation?
These changes look fine to me. But in general, for the whole code base, we should really be refactoring the code.
For example, each of the folders has duplicate files, e.g., function_training.py, model_definition.py, etc. These should be combined into a common set of utils. Then we can reduce code duplication.
I would also like to introduce docstrings, at least on new additions to the code.
I quite like Google's style (e.g., see here).
Could you please add some docstrings to the new functions in this PR?
Going forwards we can add them to all new changes.
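For reference, a minimal sketch of what a Google-style docstring could look like; the function name and arguments here are made up for illustration and are not taken from this PR:

def scale_inputs(x, mean, std):
    """Normalises input features before they are passed to the model.

    Args:
        x: Input tensor of shape (batch, features).
        mean: Per-feature mean used for centring.
        std: Per-feature standard deviation used for scaling.

    Returns:
        The normalised tensor, same shape as x.
    """
    return (x - mean) / std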
As for the code, just a few minor changes.
        self.layer2 = nn.Linear(hdim, hdim)
        self.act2 = nn.LeakyReLU()
        # self.bnorm2 = nn.BatchNorm1d(hdim)
        # -------------------------------------------------------
        self.layer3 = nn.Linear(hdim, hdim)
        self.act3 = nn.LeakyReLU()
        # self.bnorm3 = nn.BatchNorm1d(hdim)
        # -------------------------------------------------------
        self.layer4 = nn.Linear(hdim, hdim)
        self.act4 = nn.LeakyReLU()
        # self.bnorm4 = nn.BatchNorm1d(2 * hdim)
        # --------------------------------------------------------
        self.layer5 = nn.Linear(hdim, hdim)
        self.act5 = nn.LeakyReLU()
        # self.bnorm5 = nn.BatchNorm1d(hdim)
        # -------------------------------------------------------
        self.layer6 = nn.Linear(hdim, hdim)
        self.act6 = nn.LeakyReLU()
        # -------------------------------------------------------
        self.layer7 = nn.Linear(hdim, hdim)
        self.act7 = nn.LeakyReLU()
        # -------------------------------------------------------
        self.layer8 = nn.Linear(hdim, hdim)
        self.act8 = nn.LeakyReLU()
        # -------------------------------------------------------
        self.layer9 = nn.Linear(hdim, hdim)
        self.act9 = nn.LeakyReLU()
        # -------------------------------------------------------
        self.layer10 = nn.Linear(hdim, 2 * odim)
        self.act10 = nn.LeakyReLU()
We should gather these into a list and loop over the layers and activation functions.
see here for example:
layers = []
input_size = in_ver * ilev + in_nover
for _ in range(hidden_layers):
    layers.append(nn.Linear(input_size, hidden_size, dtype=torch.float64))
    layers.append(nn.SiLU())
    input_size = hidden_size
layers.append(nn.Linear(hidden_size, out_ver * ilev, dtype=torch.float64))
self.linear_stack = nn.Sequential(*layers)
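With the layers collected into an nn.Sequential like this, the forward pass reduces to a single call; a minimal sketch, assuming the attribute is named linear_stack as above:

def forward(self, x):
    # One call runs every Linear layer and activation in order
    return self.linear_stack(x)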
        x = self.dropout(self.act1(self.layer1(x)))
        x = self.dropout(self.act2(self.layer2(x)))
        x = self.dropout(self.act3(self.layer3(x)))
        x = self.dropout(self.act4(self.layer4(x)))
        x = self.dropout(self.act5(self.layer5(x)))
        x = self.dropout(self.act6(self.layer6(x)))
        x = self.dropout(self.act7(self.layer7(x)))
        x = self.dropout(self.act8(self.layer8(x)))
        x = self.dropout(self.act9(self.layer9(x)))
        x = self.dropout(self.act10(self.layer10(x)))
Similarly here: we should gather these into a list and loop over the layers and activation functions.
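One way that loop could look, as a minimal sketch assuming the layers and activations are stored in nn.ModuleLists; the attribute names layers and acts are made up for illustration, and self.dropout is the existing dropout module:

# In __init__ (illustrative names, not the current attributes):
# self.layers = nn.ModuleList([...])  # the ten nn.Linear layers, in order
# self.acts = nn.ModuleList([...])    # the matching nn.LeakyReLU activations

def forward(self, x):
    # Apply each Linear layer, its activation, then dropout, in order
    for layer, act in zip(self.layers, self.acts):
        x = self.dropout(act(layer(x)))
    return x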
@@ -24,7 +24,7 @@
 import pandas as pd

 from dataloader_definition import Dataset_ANN_CNN
-from model_definition import ANN_CNN
+from model_definition import ANN_CNN, ANN_CNN10
As part of the refactor we should perhaps make the number of layers a parameter rather than a new class?
#27
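A rough sketch of what that could look like; the class name ANN_Configurable and the argument names below are assumptions for illustration, not the existing ANN_CNN signature:

import torch.nn as nn

class ANN_Configurable(nn.Module):
    """Fully connected network with a configurable number of hidden layers.

    Args:
        idim: Number of input features.
        odim: Output size; the final layer produces 2 * odim values.
        hdim: Width of each hidden layer.
        n_hidden_layers: Number of hidden Linear + LeakyReLU blocks.
        dropout: Dropout probability applied after each activation.
    """

    def __init__(self, idim, odim, hdim, n_hidden_layers=6, dropout=0.0):
        super().__init__()
        layers = [nn.Linear(idim, hdim), nn.LeakyReLU(), nn.Dropout(dropout)]
        for _ in range(n_hidden_layers - 1):
            layers += [nn.Linear(hdim, hdim), nn.LeakyReLU(), nn.Dropout(dropout)]
        layers += [nn.Linear(hdim, 2 * odim), nn.LeakyReLU(), nn.Dropout(dropout)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

With something like this, the 6- and 10-layer variants would just be n_hidden_layers=6 and n_hidden_layers=10 rather than separate classes.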
Added a new model definition to the code to test ANN performance with 10 hidden layers. I have just added an ablation flag to the training file. Would you suggest having a separate file for these ablation studies instead? Also, there is the possibility of making a more general neural net definition that takes the number of hidden layers as an argument. I am not sure I want to work on that right now, but I would be happy to look into it in the future.
Background:
The current ANNs have 6 hidden layers, which I do not think is too few, but it is good to run some sensitivity tests around this. Having too many layers is also not ideal due to vanishing gradients, so I have selected 10 layers (which I think is how many Qiang uses).