Cross Validated
Asked by Ken Geonmin Kim on November 26, 2021
Consider an autoregressive model (e.g. an RNN language model) that outputs the next token given all previous tokens. When generating a sequence with this model, the model needs to learn when the sequence should end.
In the discrete-sequence case, the model is usually trained to maximize the softmax probability of an EOS (end-of-sequence) token when the sequence reaches its end.
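For concreteness, here is a minimal sketch of that discrete-sequence setup; the framework choice (PyTorch), vocabulary size, EOS id, and dimensions are illustrative assumptions, not anything specific to one model. An EOS id is appended to each target sequence, so the cross-entropy loss over the softmax trains it like any other vocabulary entry:

```python
# Minimal sketch: train an RNN LM where <eos> is just another vocabulary id.
# VOCAB_SIZE, EOS_ID, and HIDDEN are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB_SIZE, EOS_ID, HIDDEN = 1000, 0, 128

embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
head = nn.Linear(HIDDEN, VOCAB_SIZE)      # logits over vocab, incl. <eos>
loss_fn = nn.CrossEntropyLoss()

tokens = torch.tensor([[5, 42, 7]])       # toy training sequence
tokens = torch.cat([tokens, torch.tensor([[EOS_ID]])], dim=1)  # append <eos>

inputs, targets = tokens[:, :-1], tokens[:, 1:]
hidden_states, _ = rnn(embed(inputs))
logits = head(hidden_states)              # (batch, time, vocab)
loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))
loss.backward()                           # <eos> is trained like any token
```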
For the continuous-sequence case (such as speech), what technique can we use to teach the model where the end of the sequence is?
We can see from the definition of an autoregressive model
$$X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \epsilon_t$$
that each value $X_t$ depends linearly on the model's own previous values. The same holds at the last step: whether step $n$ is the end of the sequence, i.e. whether the next step should generate an end-of-sequence signal, is likewise predicted from the preceding values.
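To make the recurrence concrete, here is a small simulation of an AR(p) process; the order, coefficients, and noise scale are illustrative assumptions. Every new value, and therefore any end-of-sequence decision, is a function of the same finite history:

```python
# Illustration of the AR(p) recurrence above (assumptions: p=2,
# hand-picked coefficients, Gaussian noise as epsilon_t).
import numpy as np

rng = np.random.default_rng(0)
c, phi = 0.1, np.array([0.6, 0.3])   # constant c and coefficients phi_1, phi_2
p = len(phi)

x = [0.0, 0.0]                       # initial history X_{-1}, X_0
for t in range(50):
    eps = rng.normal(scale=0.1)      # white-noise term epsilon_t
    # x[-1] is X_{t-1}, paired with phi_1; hence the reversal of the window
    x.append(c + phi @ np.array(x[-p:][::-1]) + eps)
print(x[-5:])
```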
For instance, in WaveNet the output values are quantized into channels, and the output layer ends with a softmax over those channels. When the sequence should terminate, the channel that represents silence gains the highest probability.
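A hedged sketch of that stopping rule might look as follows; the `model` interface, channel count, silence-channel index, and patience threshold are placeholder assumptions rather than the actual WaveNet API. Generation halts once the silence channel has dominated the sampled output for a sustained run of steps:

```python
# Sketch: stop WaveNet-style generation on sustained silence.
# `model` is a placeholder that maps past samples to next-step logits.
import torch

NUM_CHANNELS = 256
SILENCE_CHANNEL = 128            # mu-law code near zero amplitude (assumption)
MAX_STEPS, PATIENCE = 16000, 100

def generate(model, seed):
    samples, silent_run = list(seed), 0
    for _ in range(MAX_STEPS):
        logits = model(torch.tensor(samples))   # (NUM_CHANNELS,) next-step logits
        probs = torch.softmax(logits, dim=-1)
        nxt = int(torch.multinomial(probs, 1))  # sample next quantized value
        samples.append(nxt)
        silent_run = silent_run + 1 if nxt == SILENCE_CHANNEL else 0
        if silent_run >= PATIENCE:              # sustained silence => terminate
            break
    return samples
```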
Answered by Lerner Zhang on November 26, 2021