Data Science Asked by Wickkiey on March 27, 2021
Hi, I am trying to understand neural networks with PyTorch, and I have a question about how the gradients are calculated.
import torch.optim as optim
# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)
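# (net, criterion, input and target are assumed to be defined earlier)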
# in your training loop:
optimizer.zero_grad() # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step() # Does the update
From the above code, I understand that loss.backward() calculates the gradients. I am not sure how this information is shared with the optimizer so that it can update the parameters.
Can anyone explain this?
Thanks in advance!
Recall that you passed net.parameters() to the optimizer, so it has access to the parameter Tensor objects as well as their associated data. One of the fields attached to each learnable parameter tensor is a gradient buffer. Hence, backward() not only computes the gradients but also stores them in each parameter tensor, so that the gradient for a parameter is kept alongside that parameter. In other words, for some parameter $\theta_i$, backward() stores $\partial \mathcal{L}(\Theta) / \partial \theta_i$ along with that parameter. The optimizer.step() call then simply updates each parameter using the gradient stored with it.
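To make this concrete, here is a minimal sketch (the tiny nn.Linear model and the random input/target below are made up purely for illustration) showing that each parameter's .grad field is filled by backward(), and that for plain SGD without momentum, optimizer.step() is roughly equivalent to subtracting lr * p.grad from each parameter:
import torch
import torch.nn as nn
import torch.optim as optim

net = nn.Linear(3, 1)           # hypothetical tiny model, for illustration only
input = torch.randn(4, 3)       # made-up input batch
target = torch.randn(4, 1)      # made-up targets
criterion = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)

optimizer.zero_grad()
loss = criterion(net(input), target)
print([p.grad for p in net.parameters()])    # [None, None] - no gradients yet
loss.backward()                               # gradients are now stored in each parameter's .grad
print([p.grad.shape for p in net.parameters()])

# optimizer.step() reads those .grad buffers; for vanilla SGD it is
# roughly equivalent to:
with torch.no_grad():
    for p in net.parameters():
        p -= 0.01 * p.grad
With momentum, weight decay, or a different optimizer the update rule changes, but the mechanism is the same: step() works purely from the .grad buffers of the parameters it was given.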
Answered by user3658307 on March 27, 2021