Added definition for a 10 hidden layer ANN for Ablation tests #30

Open
wants to merge 1 commit into main

Conversation

amangupta2 (Contributor)

#27

Added a new model definition to test ANN performance with 10 hidden layers. For now I have just added an ablation flag to the training file; would you suggest having a separate file for these ablation studies instead? There is also the possibility of writing a more general network definition that takes the number of hidden layers as an argument. I am not sure I want to work on that right now, but I would be happy to look into it in the future.

Background:
The current ANNs have 6 hidden layers, which I do not think is too few, but it is good to conduct some sensitivity tests around this.

Having too many layers is also not ideal due to vanishing gradients, so I have selected 10 layers (which I believe is how many Qiang uses).

@TomMelt TomMelt (Collaborator) left a comment


Perhaps one silly question... but what is ablation?

These changes look fine to me, but in general we should really be refactoring the whole code base.

For example, each of the folders has duplicate files (e.g., function_training.py, model_definition.py, etc.). These should be combined into a common set of utils so that we can reduce code duplication.

I would also like to introduce docstrings, at least on new additions to the code.

I quite like Google's style (e.g., see here).

Could you please add some docstrings to the new functions in this PR?

Going forwards we can add it to all new changes.
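
For reference, a Google-style docstring on the new forward method could look roughly like this (the shape names are an assumption based on the diff below, not taken from the code):

def forward(self, x):
    """Run a forward pass through the 10-hidden-layer network.

    Args:
        x: Input tensor of shape (batch_size, input_dim).

    Returns:
        Output tensor of shape (batch_size, 2 * odim).
    """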

As for the code, just a few minor changes.

Comment on lines +416 to +446
self.layer2 = nn.Linear(hdim, hdim)
self.act2 = nn.LeakyReLU()
# self.bnorm2 = nn.BatchNorm1d(hdim)
# -------------------------------------------------------
self.layer3 = nn.Linear(hdim, hdim)
self.act3 = nn.LeakyReLU()
# self.bnorm3 = nn.BatchNorm1d(hdim)
# -------------------------------------------------------
self.layer4 = nn.Linear(hdim, hdim)
self.act4 = nn.LeakyReLU()
# self.bnorm4 = nn.BatchNorm1d(2 * hdim)
# --------------------------------------------------------
self.layer5 = nn.Linear(hdim, hdim)
self.act5 = nn.LeakyReLU()
# self.bnorm5 = nn.BatchNorm1d(hdim)
# -------------------------------------------------------

self.layer6 = nn.Linear(hdim, hdim)
self.act6 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer7 = nn.Linear(hdim, hdim)
self.act7 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer8 = nn.Linear(hdim, hdim)
self.act8 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer9 = nn.Linear(hdim, hdim)
self.act9 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer10 = nn.Linear(hdim, 2 * odim)
self.act10 = nn.LeakyReLU()
Collaborator

We should gather this into a list and loop over the layers and activation functions.

Collaborator

See here for example:

https://github.com/DataWaveProject/CAM_GW_pytorch_emulator/blob/fd780f5b23fbdbe83d0bf7b33f3ff5c3e216fede/newCAM_emulation/Model.py#L61-L68

        layers = []
        input_size = in_ver * ilev + in_nover  
        for _ in range(hidden_layers):
            layers.append(nn.Linear(input_size, hidden_size, dtype=torch.float64))
            layers.append(nn.SiLU())
            input_size = hidden_size
        layers.append(nn.Linear(hidden_size, out_ver * ilev, dtype=torch.float64))
        self.linear_stack = nn.Sequential(*layers)
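
In other words, the Linear/activation pairs are built up in a plain Python list and wrapped in nn.Sequential, so changing the number of hidden layers only changes the loop count rather than the class definition.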

Comment on lines +458 to +467
x = self.dropout(self.act1(self.layer1(x)))
x = self.dropout(self.act2(self.layer2(x)))
x = self.dropout(self.act3(self.layer3(x)))
x = self.dropout(self.act4(self.layer4(x)))
x = self.dropout(self.act5(self.layer5(x)))
x = self.dropout(self.act6(self.layer6(x)))
x = self.dropout(self.act7(self.layer7(x)))
x = self.dropout(self.act8(self.layer8(x)))
x = self.dropout(self.act9(self.layer9(x)))
x = self.dropout(self.act10(self.layer10(x)))
Collaborator

Similarly here:

We should gather this into a list and loop over the layers and activation functions.
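
A rough sketch of what that loop could look like, assuming the layers and activations have been gathered into nn.ModuleList attributes (the names self.layers and self.acts are placeholders, not the current attribute names):

# iterate over (layer, activation) pairs instead of spelling out each line
for layer, act in zip(self.layers, self.acts):
    x = self.dropout(act(layer(x)))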

@@ -24,7 +24,7 @@
import pandas as pd

from dataloader_definition import Dataset_ANN_CNN
from model_definition import ANN_CNN
from model_definition import ANN_CNN, ANN_CNN10
Collaborator

As part of the refactor we should perhaps make the number of layers a parameter rather than adding a new class?
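
A rough sketch of what that could look like, reusing hdim and odim from the existing classes; the argument name n_hidden, the dropout handling, and the default values are placeholders rather than the project's actual interface:

import torch.nn as nn


class ANN(nn.Module):
    """Fully connected network with a configurable number of hidden layers."""

    def __init__(self, idim, odim, hdim, n_hidden=6, dropout=0.1):
        super().__init__()
        self.layers = nn.ModuleList()
        self.acts = nn.ModuleList()
        in_dim = idim
        # hidden layers: Linear followed by LeakyReLU, as in the current classes
        for _ in range(n_hidden):
            self.layers.append(nn.Linear(in_dim, hdim))
            self.acts.append(nn.LeakyReLU())
            in_dim = hdim
        # output layer, keeping the existing 2 * odim output size
        self.layers.append(nn.Linear(hdim, 2 * odim))
        self.acts.append(nn.LeakyReLU())
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        for layer, act in zip(self.layers, self.acts):
            x = self.dropout(act(layer(x)))
        return x

The 10-layer variant would then just be ANN(idim, odim, hdim, n_hidden=10) instead of a separate ANN_CNN10 class.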

@TomMelt added the enhancement (New feature or request) label on Dec 17, 2024