
optimizer.zero_grad() before loss.backward()? #19

Open
ale94mleon opened this issue Nov 8, 2022 · 1 comment

@ale94mleon

Hi, excellent tutorials! But I have a question. From tutorial 13 onward you change the place where the zero_grad method is called, and I don't get why.
Before tutorial 13 it was:

loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
optimizer.zero_grad()

After tutorial 13:

loss = criterion(outputs, labels)
optimizer.zero_grad() # Here is the change
loss.backward()
optimizer.step()

Now I am wondering: if you set the gradients to zero, how can the optimizer update the parameters without any information about the gradient?

@saeedahmadicp

When we start each iteration of the training loop, we should ideally zero out the gradients so that the parameter update is done correctly. Otherwise, the gradient would be a combination of the old gradient, which we have already used to update our model parameters, and the newly computed gradient; it would therefore point in some direction other than the intended direction towards the minimum (or maximum, in the case of a maximization objective). Note that zero_grad() only clears those stale gradients: loss.backward() is called afterwards and fills in fresh gradients, so optimizer.step() still has all the information it needs. The only requirement is that zero_grad() runs before backward() within the same iteration.
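For completeness, here is a minimal sketch (assuming PyTorch with an illustrative toy model and random data, not the tutorial's actual code) showing the zero_grad() -> backward() -> step() ordering: zero_grad() clears only the previous iteration's gradients, and backward() repopulates them before step() is called.

import torch
import torch.nn as nn

model = nn.Linear(10, 1)                                  # hypothetical toy model
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(32, 10)                              # hypothetical random batch
labels = torch.randn(32, 1)

for epoch in range(5):
    outputs = model(inputs)
    loss = criterion(outputs, labels)

    optimizer.zero_grad()  # clear gradients left over from the previous iteration
    loss.backward()        # compute fresh gradients for this batch
    optimizer.step()       # update parameters using the fresh gradients

Calling zero_grad() at the end of the loop (as in the earlier tutorials) is equivalent, as long as the gradients are cleared at some point before the next backward() call.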
