Added definition for a 10 hidden layer ANN for Ablation tests #30

Open
wants to merge 1 commit into main

Conversation

amangupta2 (Contributor)

#27

Added a new model definition to test ANN performance with 10 hidden layers. For now I have just added an ablation flag to the training file; would you suggest having a separate file for these ablation studies instead? There is also the possibility of writing a more general network definition that takes the number of hidden layers as an argument. I am not sure I want to work on that right now, but I would be happy to look into it in the future.

Background:
The current ANNs have 6 hidden layers, which I do not think is too few, but it is good to conduct some sensitivity tests around this.

Having too many layers is also not ideal due to vanishing gradients, so I have selected 10 layers (which I believe is how many Qiang uses).

@TomMelt TomMelt (Collaborator) left a comment


Perhaps one silly question... but what is ablation?

These changes look fine to me, but in general we should really be refactoring the whole code base.

For example, each of the folders has duplicate files (e.g., function_training.py, model_definition.py, etc.). These should be combined into a common set of utils so that we can reduce code duplication.

I would also like to introduce docstrings, at least on new additions to the code.

I quite like Google's style (e.g., see here).

Could you please add some docstrings to the new functions in this PR?

Going forwards we can add it to all new changes.
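
For reference, a Google-style docstring on the new forward method could look roughly like this (the shape names are an assumption based on the diff below, not taken from the code):

def forward(self, x):
    """Run a forward pass through the 10-hidden-layer network.

    Args:
        x: Input tensor of shape (batch_size, input_dim).

    Returns:
        Output tensor of shape (batch_size, 2 * odim).
    """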

As for the code, just a few minor changes.

Comment on lines +416 to +446
self.layer2 = nn.Linear(hdim, hdim)
self.act2 = nn.LeakyReLU()
# self.bnorm2 = nn.BatchNorm1d(hdim)
# -------------------------------------------------------
self.layer3 = nn.Linear(hdim, hdim)
self.act3 = nn.LeakyReLU()
# self.bnorm3 = nn.BatchNorm1d(hdim)
# -------------------------------------------------------
self.layer4 = nn.Linear(hdim, hdim)
self.act4 = nn.LeakyReLU()
# self.bnorm4 = nn.BatchNorm1d(2 * hdim)
# --------------------------------------------------------
self.layer5 = nn.Linear(hdim, hdim)
self.act5 = nn.LeakyReLU()
# self.bnorm5 = nn.BatchNorm1d(hdim)
# -------------------------------------------------------

self.layer6 = nn.Linear(hdim, hdim)
self.act6 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer7 = nn.Linear(hdim, hdim)
self.act7 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer8 = nn.Linear(hdim, hdim)
self.act8 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer9 = nn.Linear(hdim, hdim)
self.act9 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer10 = nn.Linear(hdim, 2 * odim)
self.act10 = nn.LeakyReLU()
Collaborator

We should gather this into a list and loop over the layers and activation functions.

Collaborator

See here for example:

https://github.com/DataWaveProject/CAM_GW_pytorch_emulator/blob/fd780f5b23fbdbe83d0bf7b33f3ff5c3e216fede/newCAM_emulation/Model.py#L61-L68

        layers = []
        input_size = in_ver * ilev + in_nover  
        for _ in range(hidden_layers):
            layers.append(nn.Linear(input_size, hidden_size, dtype=torch.float64))
            layers.append(nn.SiLU())
            input_size = hidden_size
        layers.append(nn.Linear(hidden_size, out_ver * ilev, dtype=torch.float64))
        self.linear_stack = nn.Sequential(*layers)
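
In other words, the Linear/activation pairs are built up in a plain Python list and wrapped in nn.Sequential, so changing the number of hidden layers only changes the loop count rather than the class definition.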

Comment on lines +458 to +467
x = self.dropout(self.act1(self.layer1(x)))
x = self.dropout(self.act2(self.layer2(x)))
x = self.dropout(self.act3(self.layer3(x)))
x = self.dropout(self.act4(self.layer4(x)))
x = self.dropout(self.act5(self.layer5(x)))
x = self.dropout(self.act6(self.layer6(x)))
x = self.dropout(self.act7(self.layer7(x)))
x = self.dropout(self.act8(self.layer8(x)))
x = self.dropout(self.act9(self.layer9(x)))
x = self.dropout(self.act10(self.layer10(x)))
Collaborator

Similarly here:

We should gather this into a list and loop over the layers and activation functions.
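
A rough sketch of what that loop could look like, assuming the layers and activations have been gathered into nn.ModuleList attributes (the names self.layers and self.acts are placeholders, not the current attribute names):

# iterate over (layer, activation) pairs instead of spelling out each line
for layer, act in zip(self.layers, self.acts):
    x = self.dropout(act(layer(x)))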

@@ -24,7 +24,7 @@
import pandas as pd

from dataloader_definition import Dataset_ANN_CNN
from model_definition import ANN_CNN
from model_definition import ANN_CNN, ANN_CNN10
Collaborator

As part of the refactor we should perhaps make the number of layers a parameter rather than adding a new class?
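
A rough sketch of what that could look like, reusing hdim and odim from the existing classes; the argument name n_hidden, the dropout handling, and the default values are placeholders rather than the project's actual interface:

import torch.nn as nn


class ANN(nn.Module):
    """Fully connected network with a configurable number of hidden layers."""

    def __init__(self, idim, odim, hdim, n_hidden=6, dropout=0.1):
        super().__init__()
        self.layers = nn.ModuleList()
        self.acts = nn.ModuleList()
        in_dim = idim
        # hidden layers: Linear followed by LeakyReLU, as in the current classes
        for _ in range(n_hidden):
            self.layers.append(nn.Linear(in_dim, hdim))
            self.acts.append(nn.LeakyReLU())
            in_dim = hdim
        # output layer, keeping the existing 2 * odim output size
        self.layers.append(nn.Linear(hdim, 2 * odim))
        self.acts.append(nn.LeakyReLU())
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        for layer, act in zip(self.layers, self.acts):
            x = self.dropout(act(layer(x)))
        return x

The 10-layer variant would then just be ANN(idim, odim, hdim, n_hidden=10) instead of a separate ANN_CNN10 class.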

@TomMelt added the enhancement (New feature or request) label on Dec 17, 2024