Some question about CPG_cifar100_scratch_mul_1.5.sh #2

rmp918 · 2020-08-07T16:14:15Z

Thanks for this great work.
I have two questions about CPG_cifar100_scratch_mul_1.5.sh.
I ran this script to carry out step 3 of experiment1 for my adversarial training project.
After training the first task, pruning it, and choosing a ratio,
I got this error:

I can't figure out why network_width_multiplier is not passed on to the next task with the VGG network.

My second question is about max_allowed_network_width_multiplier

CPG/experiment1/CPG_cifar100_scratch_mul_1.5.sh, Line 32 in 5acb7d1:

max_allowed_network_width_multiplier=1.5

and network_width_multiplier

CPG/experiment1/CPG_cifar100_scratch_mul_1.5.sh, Line 36 in 5acb7d1:

network_width_multiplier=1.0

Are these two settings related, or is there anything special about them?

Hope someone can help me solve this problem, thanks.

andytu28 · 2020-08-10T03:54:38Z

@rmp918
For the first question, I am not quite sure what causes the error, because it works well in my setting (I just ran the code again with PyTorch 1.4.0).
After training and pruning the 1st task, it should save the pruned model under checkpoints/CPG/experiment1/scratch_mul_1.5/custom_vgg_cifar100/aquatic_mammals/gradual_prune, and the saved model will be loaded before training the 2nd task.
The checkpoint is saved by the function save_checkpoint(...) defined in utils/manager.py at Line 198, and the key network_width_multiplier is saved there in the dictionary shared_layer_info as well.
I think normally it should work, but because the KeyError occurs in your setting, maybe you can check the function to see if your checkpoint is saved correctly? Hope this can help you figure out the problem.
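One way to run that check is sketched below. It searches a nested dictionary for the key network_width_multiplier and reports where it is found. The helper name find_key_paths and the sample checkpoint layout are assumptions made for illustration; the real layout of shared_layer_info may differ.

```python
from typing import Any, List, Tuple


def find_key_paths(obj: Any, key: str, path: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return every path inside a nested dict at which `key` appears."""
    found: List[Tuple[str, ...]] = []
    if isinstance(obj, dict):
        for k, v in obj.items():
            here = path + (str(k),)
            if k == key:
                found.append(here)
            # Keep descending so deeper occurrences are reported too.
            found.extend(find_key_paths(v, key, here))
    return found


# Hypothetical stand-in for a loaded checkpoint; the nesting is an assumption.
checkpoint = {
    "shared_layer_info": {
        "aquatic_mammals": {"network_width_multiplier": 1.0},
    },
}

paths = find_key_paths(checkpoint, "network_width_multiplier")
print(paths)
```

With a real file, the dictionary would typically come from torch.load(path, map_location="cpu") on the checkpoint under the gradual_prune directory. An empty result would suggest the key was never written, which would be consistent with a KeyError when the next task loads it.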

For the second question, max_allowed_network_width_multiplier is the maximum width multiplier allowed across all 20 tasks, and network_width_multiplier is the width multiplier for the current task.
At Line 36, network_width_multiplier is initialized to 1.0, but you can see that it gradually increases (if needed) at Line 92 of the script.
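The relationship between the two values can be sketched roughly as follows. This is not the script's actual logic: the step size of 0.5 and the needs_expansion callback are assumptions for illustration, and the sketch does not show what the real script does once the cap is reached.

```python
from typing import Callable


def grow_width(start: float, max_allowed: float, step: float,
               needs_expansion: Callable[[float], bool]) -> float:
    """Increase the width multiplier while expansion is needed and the cap allows it."""
    width = start
    while needs_expansion(width) and width + step <= max_allowed:
        width += step
    return width


# Worst case: the current task always asks for more capacity.
final_width = grow_width(1.0, 1.5, 0.5, lambda w: True)

# Best case: the initial width is already enough, so nothing grows.
unchanged_width = grow_width(1.0, 1.5, 0.5, lambda w: False)

print(final_width, unchanged_width)
```

In this sketch, network_width_multiplier corresponds to width and max_allowed_network_width_multiplier to max_allowed, so the current task's width can never exceed the global cap.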

rmp918 · 2020-08-10T05:19:14Z

Thanks for your answer.
So does that mean the program will terminate if network_width_multiplier exceeds max_allowed_network_width_multiplier?

And how does the program save and replace the file in checkpoints/CPG/experiment1/scratch_mul_1.5/custom_vgg_cifar100/xxx/gradual_prune?
In my run, it needs to expand 3 times before entering prune mode, but then it fails to save and replace the file, so the finetune mode of the next task reports an error and fails.
Could you point me to the key thing to check in this situation?

Thanks for your help~
