Some question about CPG_cifar100_scratch_mul_1.5.sh #2

Open
rmp918 opened this issue Aug 7, 2020 · 2 comments

Comments

rmp918 commented Aug 7, 2020

Thanks for this great work.
I have two questions about CPG_cifar100_scratch_mul_1.5.sh.
I run this script to complete experiment1 step 3 as part of my adversarial training project.
After training the first task, pruning it, and choosing a pruning ratio, I get this error:
[image: screenshot of the error]

I can't figure out why network_width_multiplier is not passed on to the next task with the VGG network.

My second question is about max_allowed_network_width_multiplier

max_allowed_network_width_multiplier=1.5

and network_width_multiplier

network_width_multiplier=1.0

How are these two related? Is there anything special about them?

Hope someone can help me solve this problem, thanks.

andytu28 (Contributor) commented

@rmp918
For the first question, I am not quite sure what causes the error, because it works in my setting (I just ran the code again with PyTorch 1.4.0).
After the 1st task is trained and pruned, the pruned model should be saved under checkpoints/CPG/experiment1/scratch_mul_1.5/custom_vgg_cifar100/aquatic_mammals/gradual_prune, and that saved model is loaded before the 2nd task is trained.
The checkpoint is saved by the function save_checkpoint(...) defined in utils/manager.py at Line 198, and the key network_width_multiplier is stored there as well, in the dictionary shared_layer_info.
Normally this should work, but since a KeyError occurs in your setting, maybe you can check that function to see whether your checkpoint is saved correctly. I hope this helps you figure out the problem.
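One way to run that check is to load the saved checkpoint and search it for the key. The snippet below is only a minimal sketch: the helper `find_key` is hypothetical (it is not part of the repository), and the nested layout of the stand-in dictionary, with `shared_layer_info` keyed by task name, is an assumption. In practice the stand-in would be replaced by something like `torch.load(ckpt_path, map_location="cpu")`, with `ckpt_path` pointing at the file under the gradual_prune folder.

```python
from typing import Any, Iterator, Tuple

Path = Tuple[str, ...]


def find_key(obj: Any, key: str, path: Path = ()) -> Iterator[Tuple[Path, Any]]:
    """Yield (path, value) for every occurrence of `key` in nested dicts."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            here = path + (str(k),)
            if k == key:
                yield here, v
            # Keep descending so deeper occurrences are reported too.
            yield from find_key(v, key, here)


# Stand-in for a real checkpoint (assumed layout, for illustration only).
# With PyTorch: checkpoint = torch.load(ckpt_path, map_location="cpu")
checkpoint = {
    "shared_layer_info": {
        "aquatic_mammals": {"network_width_multiplier": 1.0},
    },
}

hits = list(find_key(checkpoint, "network_width_multiplier"))
if not hits:
    print("network_width_multiplier is missing from this checkpoint")
for found_path, value in hits:
    print(" -> ".join(found_path), "=", value)
```

If the search finds nothing for the checkpoint of the 1st task, the KeyError most likely comes from how that file was written rather than from how the 2nd task reads it.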

For the second question, max_allowed_network_width_multiplier is the maximum width multiplier allowed across all 20 tasks, and network_width_multiplier is the width multiplier for the current task.
At Line 36, network_width_multiplier is initialized to 1.0, but you can see at Line 92 of the script that it is gradually increased (if needed).
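The relationship between the two values can be pictured as a current width that grows step by step under a fixed cap. The following is a rough illustration in Python, not the logic of the shell script itself: the function `width_schedule` and the step size of 0.25 are hypothetical, and only the starting value 1.0 and the cap 1.5 come from the script settings quoted in this thread.

```python
from typing import Iterator


def width_schedule(start: float, max_allowed: float, step: float) -> Iterator[float]:
    """Yield the width multipliers that may be tried, from `start` up to the cap."""
    if step <= 0:
        raise ValueError("step must be positive")
    width = start
    # Small tolerance so that rounding error does not drop the cap itself.
    while width <= max_allowed + 1e-9:
        yield round(width, 6)
        width += step


# start and cap taken from the script; the step of 0.25 is an assumption.
widths = list(width_schedule(start=1.0, max_allowed=1.5, step=0.25))
print(widths)
```

In this picture network_width_multiplier plays the role of `width` and max_allowed_network_width_multiplier the role of `max_allowed`; what the script actually does once the cap is reached is not shown here.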

rmp918 (Author) commented Aug 10, 2020

Thanks for your answer.
So does that mean the program terminates if network_width_multiplier exceeds max_allowed_network_width_multiplier?

Also, how does the program save and replace the file in checkpoints/CPG/experiment1/scratch_mul_1.5/custom_vgg_cifar100/xxx/gradual_prune?
In my run, the network needs to expand 3 times before it enters prune mode, but then it fails to save and replace the file, so the finetune mode of the next task reports an error and fails.
Can you point me to the key thing to check in this situation?

Thanks for your help~
