
Question about the gradient computation #2

Open
savannahfan opened this issue Apr 14, 2022 · 0 comments

Comments

@savannahfan

Hello author, and thank you for providing this implementation of BERT meta-learning.

While reading the Reptile part of the code, I ran into a question: in the gradient computation in reptile.py (line 90), `gradient = meta_params - fast_params`. Why isn't it the updated model parameters minus the pre-update parameters? I tried swapping the two and found it made little difference to model performance, but updating the model in the opposite direction should make it impossible to train.
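To make my confusion concrete, here is a minimal sketch of how I understand the outer update (the function and variable names are mine, not the repository's actual API), assuming the pseudo-gradient is handed to an optimizer that subtracts it:

```python
import numpy as np

def reptile_outer_step(meta_params, fast_params, outer_lr):
    """Treat (meta - fast) as a pseudo-gradient and take one SGD step.

    Because SGD subtracts the gradient, the net update is
    meta - lr * (meta - fast) = meta + lr * (fast - meta),
    i.e. the meta parameters move TOWARD the adapted fast parameters.
    """
    gradient = meta_params - fast_params   # same sign convention as reptile.py line 90
    return meta_params - outer_lr * gradient

meta = np.array([1.0, 2.0])
fast = np.array([2.0, 0.0])  # parameters after inner-loop adaptation
updated = reptile_outer_step(meta, fast, outer_lr=0.5)
# updated == meta + 0.5 * (fast - meta) == [1.5, 1.0]
```

If the optimizer indeed subtracts this "gradient", the sign in the code would move the meta parameters toward the fast parameters, which may explain why the direction looked reversed to me at first.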

Also, why is the outer learning rate set to such a small value (5e-5)? The inner learning rate is already very small; if a small outer learning rate is then multiplied by the averaged gradient, won't the resulting update be tiny?
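A rough back-of-the-envelope calculation of what worries me (the step count and gradient scale below are illustrative assumptions, not values from the repository's config):

```python
# Estimate the effective outer step size when both learning rates are small.
inner_lr = 5e-5          # inner-loop learning rate (assumed)
outer_lr = 5e-5          # outer-loop learning rate, as in the repo
inner_steps = 5          # assumed number of inner-loop updates
avg_inner_grad = 1.0     # pretend the inner gradients have unit scale

# |meta - fast| is roughly inner_lr * inner_steps * avg_inner_grad,
# so the outer update scales with the PRODUCT of both learning rates.
pseudo_grad_norm = inner_lr * inner_steps * avg_inner_grad
outer_step = outer_lr * pseudo_grad_norm
print(outer_step)  # 1.25e-08
```

With both rates at 5e-5, the effective per-step movement of the meta parameters looks vanishingly small, which is the source of my question.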

Looking forward to your reply. Thanks!
