Data Science Asked by user3363813 on May 8, 2021
I want to calculate the gradient once and use that same gradient to minimize one part and maximize another part of the same network (an adversarial setup). Ideally, there would be two optimizers, each responsible for one part of the network/model, with one of them using a negative learning rate. But it seems that PyTorch does not allow negative learning rates.
In this case, what I am doing is:

```python
loss.backward()
optimizer_for_one_part_of_the_model.step()
```

and then

```python
(-loss).backward()
```
The problem is that the gradient computed the second time is not the same (the values differ, not just the sign), because some weights of the network (the same computation graph) have already been changed by the first optimizer step. Ideally, I want to use the flipped version of the original gradient.
How can I achieve this?
The trick you are looking for is called the Gradient Reversal Layer. It is a layer that does nothing (i.e., it is the identity) in the forward pass, but it reverses the sign of the gradient in the backward pass, so everything behind the layer optimizes the opposite of the loss function.
There are several PyTorch implementations available online.
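As a rough illustration, a minimal sketch of such a layer built on torch.autograd.Function could look like the following (the class and helper names are my own, not taken from any particular library):

```python
import torch


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips the sign of the gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, alpha=1.0):
        # Store the scaling factor for the backward pass and return the input unchanged.
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Same gradient values, but with the sign flipped (and optionally scaled).
        # The second return value is the gradient w.r.t. alpha, which is not needed.
        return -ctx.alpha * grad_output, None


def grad_reverse(x, alpha=1.0):
    return GradReverse.apply(x, alpha)
```

Because the layer is the identity in the forward pass, the gradient it passes down is exactly the negation of the gradient it receives, which is the "flipped version of the previous gradient" you are asking about.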
Initially, it was introduced for unsupervised domain adaptation. It now has quite a lot of applications, such as removing sensitive information from computer vision representations or removing language identity from multilingual contextual embeddings.
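To tie this back to your setup: if the reversal layer is placed between the two parts of the model, a single backward pass gives the downstream part the true gradient and the upstream part its exact negation, so one optimizer and one loss.backward() call are enough. A hedged usage sketch, with made-up module names and sizes and the grad_reverse helper from the snippet above:

```python
import torch
import torch.nn as nn

# Hypothetical split of the network: "encoder" should be updated to *maximize*
# the loss, while "head" should be updated to *minimize* it.
encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
head = nn.Linear(32, 2)

optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-3
)
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 16)
y = torch.randint(0, 2, (8,))

features = encoder(x)
# The reversal layer sits between the two parts: the forward pass is unchanged,
# but gradients flowing back into the encoder arrive with their sign flipped.
logits = head(grad_reverse(features))
loss = criterion(logits, y)

optimizer.zero_grad()
loss.backward()   # one backward pass: head sees the true gradient, encoder the negated one
optimizer.step()
```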
Correct answer by Jindřich on May 8, 2021