
optimizer.zero_grad() before loss.backward()? #19

Open
ale94mleon opened this issue Nov 8, 2022 · 1 comment

@ale94mleon

Hi, excellent tutorials! But I have a question. From tutorial 13 onward you change the place where the zero_grad method is called, and I don't get why.
Before tutorial 13 it was:

loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
optimizer.zero_grad()

After tutorial 13:

loss = criterion(outputs, labels)
optimizer.zero_grad() # Here is the change
loss.backward()
optimizer.step()

Now I am wondering: if you set the gradients to zero, how can the optimizer update the parameters without any information about the gradient?

@saeedahmadicp

When we start each iteration of the training loop, we should ideally zero out the gradients so that the parameter update is done correctly. Otherwise, the gradient would be a combination of the old gradient, which we have already used to update our model parameters, and the newly computed gradient; it would therefore point in some direction other than the intended direction towards the minimum (or maximum, in the case of a maximization objective). Note that zero_grad() only clears those stale gradients: loss.backward() is called afterwards and fills in fresh gradients, so optimizer.step() still has all the information it needs. The only requirement is that zero_grad() runs before backward() within the same iteration.
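For completeness, here is a minimal sketch (assuming PyTorch with an illustrative toy model and random data, not the tutorial's actual code) showing the zero_grad() -> backward() -> step() ordering: zero_grad() clears only the previous iteration's gradients, and backward() repopulates them before step() is called.

import torch
import torch.nn as nn

model = nn.Linear(10, 1)                                  # hypothetical toy model
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(32, 10)                              # hypothetical random batch
labels = torch.randn(32, 1)

for epoch in range(5):
    outputs = model(inputs)
    loss = criterion(outputs, labels)

    optimizer.zero_grad()  # clear gradients left over from the previous iteration
    loss.backward()        # compute fresh gradients for this batch
    optimizer.step()       # update parameters using the fresh gradients

Calling zero_grad() at the end of the loop (as in the earlier tutorials) is equivalent, as long as the gradients are cleared at some point before the next backward() call.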
