Cross Validated
Asked by Ken Geonmin Kim on November 26, 2021
Consider an autoregressive model (e.g. an RNN language model) that outputs the next token given all previous tokens. When generating a sequence with this model, the model needs to learn when the sequence should end.
In the discrete-sequence case, the model is usually trained to maximize the softmax probability of an EOS (end-of-sequence) token when the sequence reaches its end.
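For concreteness, here is a minimal sketch of that discrete-sequence setup; the framework choice (PyTorch), vocabulary size, EOS id, and dimensions are illustrative assumptions, not anything specific to one model. An EOS id is appended to each target sequence, so the cross-entropy loss over the softmax trains it like any other vocabulary entry:

```python
# Minimal sketch: train an RNN LM where <eos> is just another vocabulary id.
# VOCAB_SIZE, EOS_ID, and HIDDEN are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB_SIZE, EOS_ID, HIDDEN = 1000, 0, 128

embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
head = nn.Linear(HIDDEN, VOCAB_SIZE)      # logits over vocab, incl. <eos>
loss_fn = nn.CrossEntropyLoss()

tokens = torch.tensor([[5, 42, 7]])       # toy training sequence
tokens = torch.cat([tokens, torch.tensor([[EOS_ID]])], dim=1)  # append <eos>

inputs, targets = tokens[:, :-1], tokens[:, 1:]
hidden_states, _ = rnn(embed(inputs))
logits = head(hidden_states)              # (batch, time, vocab)
loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))
loss.backward()                           # <eos> is trained like any token
```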
For the continuous-sequence case (such as speech), what technique can we use to teach the model where the end of the sequence is?
We can see from the definition of an autoregressive model
$$X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \epsilon_t$$
that each value $X_t$ depends linearly on the model's own previous values. The same holds at the last step: whether step $n$ is the end of the sequence, i.e. whether the next step should generate an end-of-sequence signal, is likewise predicted from the preceding values.
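To make the recurrence concrete, here is a small simulation of an AR(p) process; the order, coefficients, and noise scale are illustrative assumptions. Every new value, and therefore any end-of-sequence decision, is a function of the same finite history:

```python
# Illustration of the AR(p) recurrence above (assumptions: p=2,
# hand-picked coefficients, Gaussian noise as epsilon_t).
import numpy as np

rng = np.random.default_rng(0)
c, phi = 0.1, np.array([0.6, 0.3])   # constant c and coefficients phi_1, phi_2
p = len(phi)

x = [0.0, 0.0]                       # initial history X_{-1}, X_0
for t in range(50):
    eps = rng.normal(scale=0.1)      # white-noise term epsilon_t
    # x[-1] is X_{t-1}, paired with phi_1; hence the reversal of the window
    x.append(c + phi @ np.array(x[-p:][::-1]) + eps)
print(x[-5:])
```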
For instance, in WaveNet the output values are quantized into channels, and the output layer ends with a softmax over those channels. When the sequence should terminate, the channel that represents silence gains the highest probability.
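A hedged sketch of that stopping rule might look as follows; the `model` interface, channel count, silence-channel index, and patience threshold are placeholder assumptions rather than the actual WaveNet API. Generation halts once the silence channel has dominated the sampled output for a sustained run of steps:

```python
# Sketch: stop WaveNet-style generation on sustained silence.
# `model` is a placeholder that maps past samples to next-step logits.
import torch

NUM_CHANNELS = 256
SILENCE_CHANNEL = 128            # mu-law code near zero amplitude (assumption)
MAX_STEPS, PATIENCE = 16000, 100

def generate(model, seed):
    samples, silent_run = list(seed), 0
    for _ in range(MAX_STEPS):
        logits = model(torch.tensor(samples))   # (NUM_CHANNELS,) next-step logits
        probs = torch.softmax(logits, dim=-1)
        nxt = int(torch.multinomial(probs, 1))  # sample next quantized value
        samples.append(nxt)
        silent_run = silent_run + 1 if nxt == SILENCE_CHANNEL else 0
        if silent_run >= PATIENCE:              # sustained silence => terminate
            break
    return samples
```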
Answered by Lerner Zhang on November 26, 2021