Asked on Cross Validated, December 5, 2021
From the paper that introduced attention mechanisms (Bahdanau et al., 2014: Neural Machine Translation by Jointly Learning to Align and Translate), it seems that the translating part is a regular RNN encoder-decoder, while the aligning part is the actual attention mechanism (a separate, smaller MLP) used to align words in the source sentence with words in the target sentence.
Is that interpretation correct? Is the so-called attention mechanism the alignment model?
If so, is the attention mechanism used to attend to certain words in the source sentence at each step of predicting words for the target sentence?
Yes, this is the idea that the original paper promoted.
Note, however, that the term "alignment" is a bit tricky here. In statistical machine translation, an alignment was defined as a meaning correspondence between source and target words. A highly cited 2017 study showed that attention can learn very unintuitive alignments that look entirely wrong, yet without any loss of translation quality.
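To make the mechanics concrete, here is a minimal NumPy sketch of the additive alignment model: a small MLP scores each encoder state against the previous decoder state, and a softmax over those scores gives the attention weights used to build the context vector. The parameter names (W_s, W_h, v) and dimensions are illustrative, not taken from any library:

```python
import numpy as np

def additive_attention(s_prev, H, W_s, W_h, v):
    """Bahdanau-style additive attention (the alignment model).

    s_prev : (d_dec,)        previous decoder hidden state s_{i-1}
    H      : (T, d_enc)      encoder hidden states h_1 .. h_T
    W_s    : (d_attn, d_dec) projection of the decoder state
    W_h    : (d_attn, d_enc) projection of the encoder states
    v      : (d_attn,)       scoring vector
    """
    # Alignment scores e_ij = v^T tanh(W_s s_{i-1} + W_h h_j), one per source position
    scores = np.tanh(W_s @ s_prev + H @ W_h.T) @ v   # shape (T,)
    # Softmax turns the scores into attention weights alpha_ij
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # shape (T,)
    # Context vector c_i: attention-weighted sum of the encoder states
    context = weights @ H                             # shape (d_enc,)
    return weights, context

# Toy usage with random parameters (all sizes are made up for illustration)
rng = np.random.default_rng(0)
T, d_enc, d_dec, d_attn = 5, 8, 8, 16
H = rng.normal(size=(T, d_enc))
s_prev = rng.normal(size=d_dec)
W_s = rng.normal(size=(d_attn, d_dec))
W_h = rng.normal(size=(d_attn, d_enc))
alpha, c = additive_attention(s_prev, H, W_s, W_h, rng.normal(size=d_attn))
print(alpha)  # weights over the 5 source positions, summing to 1
```

In the paper, the weights alpha are recomputed at every decoder step, so each predicted target word attends to a different mixture of source positions; that weight matrix over all steps is what gets read as a (soft) word alignment.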
Answered by Jindřich on December 5, 2021