Artificial Intelligence Asked by thepacker on January 5, 2022

Let’s assume that we embedded a vector of length 49 into a matrix using 512-d embeddings. If we then multiply the matrix by its transposed version, we receive a matrix of 49 by 49, which is symmetric. Let’s also assume we do not add the positional encoding and we only have only one attention head in the first layer of the transformer architecture.

What would the result of the softmax on this 49 by 49 matrix look like? Is it still symmetric, or is the softmax correctly applied for each line of the matrix, resulting in a non-symmetric matrix? My guess would be that the matrix should not be symmetric anymore. But I’m unsure about that.

I ask this to verify if my implementation is wrong or not, and what the output should look like. I have seen so many sophisticated and different implementations of the transformer architecture with different frameworks, that I can’t answer this question for myself right now (confusion). I still try to understand the basic building blocks of the transformer architecture.

I compared my results visually to a second implementation known to be working - "The annotated transformer". I compared the pytorch calculation results of the attention-method to my implementation results.

The answe is - the softmax is applied row by row. Therefore the resulting matrix p-attn is not equal to its transposed version.

Answered by thepacker on January 5, 2022

Get help from others!

Recent Answers

- Joshua Engel on Why fry rice before boiling?
- Peter Machado on Why fry rice before boiling?
- Jon Church on Why fry rice before boiling?
- Lex on Does Google Analytics track 404 page responses as valid page views?
- haakon.io on Why fry rice before boiling?

Recent Questions

- How can I transform graph image into a tikzpicture LaTeX code?
- How Do I Get The Ifruit App Off Of Gta 5 / Grand Theft Auto 5
- Iv’e designed a space elevator using a series of lasers. do you know anybody i could submit the designs too that could manufacture the concept and put it to use
- Need help finding a book. Female OP protagonist, magic
- Why is the WWF pending games (“Your turn”) area replaced w/ a column of “Bonus & Reward”gift boxes?

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP