Data Science
Asked by Shirish Kulhari on April 9, 2021
I’m reading about Word2Vec from this source: http://jalammar.github.io/illustrated-word2vec/. The source shows a heatmap of the embeddings for various words, and it’s claimed that we can get an idea of what the different dimensions "mean" (their interpretation) based on their values across different words. For example, there’s a column that’s dark blue for every word except WATER, and since WATER is the only word in the list that doesn’t refer to a person, that dimension may have something to do with the word representing a person.
Secondly, there’s the famous example that "king" – "man" + "woman" ~= "queen", where each quoted word denotes the embedding of that word.
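For concreteness, here’s roughly how this can be checked with gensim and pretrained vectors (a minimal sketch; the model name and the inspected dimension are arbitrary illustrative choices):

```python
import gensim.downloader as api

# Load small pretrained vectors (illustrative choice; any KeyedVectors work).
kv = api.load("glove-wiki-gigaword-50")

# Inspect a single dimension across a few words, like reading one column
# of the heatmap. No particular dimension is guaranteed to be meaningful.
for word in ["king", "man", "woman", "queen", "water"]:
    print(word, round(float(kv[word][10]), 3))

# The analogy: words closest to king - man + woman.
print(kv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```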
My questions are about why these properties hold. One explanation I’ve come across is:

"This works because the neural network ends up encoding the relative frequencies of related terms into the W2V matrix. Analogous relationships, like the differences in relative occurrences of man and woman, end up matching the relative occurrences of king and queen in ways that the W2V vectors capture."

This seems like a broad, vague explanation. Is there any online resource or paper that explains (or better yet, proves) why this property of embedding vectors should hold?
The individual dimensions of an embedding space are only accidentally interpretable.
However, vectors through the space can be interpretable. That is why word analogies are possible in an embedding space: adding and subtracting word vectors describes another vector through the embedding space. For example, "king" - "man" + "woman" approximates the "queen" vector.
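A rough sketch of that arithmetic, assuming pretrained GloVe vectors loaded through gensim’s downloader (the specific model is an illustrative choice), composes the vector directly and compares it to "queen" by cosine similarity:

```python
import numpy as np
import gensim.downloader as api

# Pretrained vectors (illustrative choice; any word vectors would do).
kv = api.load("glove-wiki-gigaword-50")

# Compose a new vector by addition and subtraction.
v = kv["king"] - kv["man"] + kv["woman"]

# Cosine similarity between the composed vector and the "queen" vector.
cos = np.dot(v, kv["queen"]) / (np.linalg.norm(v) * np.linalg.norm(kv["queen"]))
print(f"cosine(king - man + woman, queen) = {cos:.3f}")
```

For common pretrained vectors, "queen" is typically the nearest neighbor of the composed vector once the three input words are excluded.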
Words are consistently used in relation to other words. Word embeddings model these consistent relationships by finding which words co-occur and projecting the words into a lower-dimensional space that retains the most common co-occurrence patterns.
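To make the co-occurrence-and-projection idea concrete, here is a minimal count-based sketch: build a word-word co-occurrence matrix from a toy corpus and project it into two dimensions with an SVD. Note this is not word2vec’s actual training procedure (word2vec trains a shallow neural network), but a closely related count-then-project view; the toy corpus and the two-dimensional projection are illustrative assumptions.

```python
import numpy as np

corpus = [
    "the king rules the kingdom",
    "the queen rules the kingdom",
    "the man walks",
    "the woman walks",
]

# Build a symmetric word-word co-occurrence matrix
# (context window = the whole sentence, for simplicity).
vocab = sorted({w for line in corpus for w in line.split()})
idx = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))
for line in corpus:
    words = line.split()
    for i, w in enumerate(words):
        for c in words[:i] + words[i + 1:]:
            counts[idx[w], idx[c]] += 1

# Project into a lower-dimensional space with an SVD; the top singular
# directions retain the most common co-occurrence patterns.
u, s, _ = np.linalg.svd(counts)
embeddings = u[:, :2] * s[:2]  # two-dimensional "embeddings"
for w in ["king", "queen", "man", "woman"]:
    print(w, np.round(embeddings[idx[w]], 2))
```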
Answered by Brian Spiering on April 9, 2021