TransWikia.com

Explanation about i//2 in positional encoding in tensorflow tutorial about transformers

Data Science Asked by Darome on August 13, 2020

I was implementing the transformer architecture in TensorFlow.

I was following this tutorial: https://www.tensorflow.org/tutorials/text/transformer#setup_input_pipeline

They implement the positional encoding in this way:

angle_rates = 1 / np.power(10000, (2 * (i//2)) / np.float32(d_model))

However, in the paper i is not floor-divided by 2 (i//2). Is this a bug, or what is the reason for this operation?

thanks

One Answer

It's not a bug, although the trick does add some confusion. It would be clearer if they called their argument $j$ instead of $i$, because what they actually do is take every value $0 \leq j \leq d_{model} - 1$ and compute $PE(pos, j)$. $j$ can be either even or odd, but the index used in the exponent on the right-hand side of the equation is always even, which is why they compute i//2 and multiply back by 2.
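To make the pairing concrete, here is a minimal sketch that compares the two indexing schemes numerically. It assumes the paper's formulas are $PE(pos, 2i) = \sin(pos / 10000^{2i/d_{model}})$ and $PE(pos, 2i+1) = \cos(pos / 10000^{2i/d_{model}})$, and it assumes even columns get the sine and odd columns the cosine. The names get_angles, pe_tutorial and pe_paper, and the sizes d_model = 8 and 4 positions, are chosen here only for illustration.

```python
import numpy as np

def get_angles(pos, i, d_model):
    # Same expression as the tutorial line quoted in the question.
    angle_rates = 1 / np.power(10000, (2 * (i // 2)) / np.float32(d_model))
    return pos * angle_rates

d_model = 8                      # small illustrative size (assumption)
num_positions = 4                # illustrative number of positions (assumption)
j = np.arange(d_model)           # every column index 0 .. d_model - 1

# 2 * (j // 2) maps each column to the even index of its (sin, cos) pair.
paired_exponent = 2 * (j // 2)
print(paired_exponent)           # [0 0 2 2 4 4 6 6]

# Tutorial-style: compute angles for all columns at once.
positions = np.arange(num_positions)[:, np.newaxis]
angles = get_angles(positions, j[np.newaxis, :], d_model)
pe_tutorial = np.zeros_like(angles)
pe_tutorial[:, 0::2] = np.sin(angles[:, 0::2])   # even columns
pe_tutorial[:, 1::2] = np.cos(angles[:, 1::2])   # odd columns

# Paper-style: i runs over pairs 0 .. d_model/2 - 1, with exponent 2i / d_model.
pe_paper = np.zeros((num_positions, d_model))
for i in range(d_model // 2):
    rate = 1 / np.power(10000, (2 * i) / d_model)
    pe_paper[:, 2 * i] = np.sin(positions[:, 0] * rate)
    pe_paper[:, 2 * i + 1] = np.cos(positions[:, 0] * rate)

print(np.allclose(pe_tutorial, pe_paper))        # True
```

Both constructions give the same matrix: the tutorial's i plays the role of the column index $j$, and i//2 recovers the pair index that the paper calls $i$.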

Answered by Michael Solotky on August 13, 2020
