Data Science Asked by Linear Algebra fans on September 4, 2021
Given a training sentence of the following form from a document:
… lemon, a tablespoon of apricot jam a pinch …
The word apricot is chosen as the target word $t$, with a window size of 2.
The training samples, both positive and negative, look as follows:
Positive samples:
apricot tablespoon
apricot of
apricot preserves
apricot or
Negative samples (each positive sample gets 2 corresponding negative samples):
apricot aardvark apricot twelve
apricot puddle apricot hello
apricot where apricot dear
apricot coaxial apricot forever
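In the original word2vec formulation, the negative words above would be drawn from the unigram distribution raised to the 3/4 power rather than chosen by hand. A minimal sketch of that sampling step, using a hypothetical toy set of word counts (the counts themselves are made up for illustration):

```python
import numpy as np

# Hypothetical toy word counts (illustrative, not from the sentence above)
counts = {"aardvark": 1, "puddle": 2, "where": 5, "coaxial": 1,
          "twelve": 2, "hello": 4, "dear": 3, "forever": 2}
words = list(counts)
freqs = np.array([counts[w] for w in words], dtype=float)

# word2vec weights each word by count^0.75, then normalizes to a distribution
probs = freqs ** 0.75
probs /= probs.sum()

rng = np.random.default_rng(0)

def sample_negatives(k=2):
    """Draw k negative context words for one positive pair."""
    return list(rng.choice(words, size=k, p=probs))

for ctx in ["tablespoon", "of", "preserves", "or"]:
    print("apricot", ctx, "->", sample_negatives())
```

The 0.75 exponent flattens the distribution slightly, so rare words are sampled a bit more often than their raw frequency would suggest.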
The log-likelihood (for a single positive pair and its $k$ negatives):
$$\log\frac{1}{1+e^{-c\cdot t}}+\sum_{i=1}^{k}\log\frac{1}{1+e^{n_i\cdot t}}$$
1. $k$ is 2, since we have 2 negative samples for each positive sample.
2. $t$ is the vector of the target word apricot.
3. $c$ is the vector of a context word within the window, e.g. tablespoon in the positive sample apricot tablespoon.
4. $n_i$ is the vector of the $i$-th negative-sample word paired with that positive sample.
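Plugging concrete vectors into the definitions above, the objective for one positive pair plus its $k$ negatives can be computed directly. A sketch with random 3-dimensional vectors (the dimensionality and values are illustrative only), using the identity $\log\frac{1}{1+e^{-x}} = \log\sigma(x)$:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(42)
d = 3                        # embedding dimension (illustrative)
t = rng.normal(size=d)       # target vector, e.g. for "apricot"
c = rng.normal(size=d)       # context vector, e.g. for "tablespoon"
n = rng.normal(size=(2, d))  # k = 2 negative-sample vectors

# log sigma(c.t) + sum_i log sigma(-n_i.t)
# note: 1/(1+e^{n_i.t}) = sigma(-n_i.t), matching the formula above
objective = np.log(sigmoid(c @ t)) + np.sum(np.log(sigmoid(-(n @ t))))
print(objective)
```

Since each $\log\sigma(\cdot)$ term is negative, the objective is always negative; training pushes it toward 0 by raising $\sigma(c\cdot t)$ and lowering each $\sigma(n_i\cdot t)$.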
Here are my questions:
1. How do I map the positive and negative samples to the vectors $c$, $n_i$, and $t$? In the deep-learning version the inputs are one-hot encoded, but what about in this version?
2. Is there a workable example with a small dataset?
3. How do I know that my trained vector $t$ is correct?
I would prefer to study this method on a very small dataset, since full-scale training needs a huge number of samples and a long training time (on the order of a week). My aim is to learn the method itself, not to produce word embeddings. Any help would be greatly appreciated; I am not only asking for help but also sharing what I learn.
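Addressing questions 1-3 concretely: in this (non-deep-learning) formulation there is no one-hot encoding; every word is simply assigned a random dense vector, and SGD nudges those vectors directly. A minimal self-contained sketch on the four positive pairs above (the dimension, learning rate, and epoch count are arbitrary choices), with a sanity check that the positive-pair score rises after training:

```python
import numpy as np

rng = np.random.default_rng(0)
d, lr = 5, 0.1  # embedding dimension and learning rate (arbitrary)

vocab = ["apricot", "tablespoon", "of", "preserves", "or",
         "aardvark", "twelve", "puddle", "hello", "where",
         "dear", "coaxial", "forever"]
# Q1: each word gets a random dense vector; separate target and context tables
T = {w: rng.normal(scale=0.1, size=d) for w in vocab}  # target vectors
C = {w: rng.normal(scale=0.1, size=d) for w in vocab}  # context vectors

positives = [("apricot", "tablespoon"), ("apricot", "of"),
             ("apricot", "preserves"), ("apricot", "or")]
negatives = {"tablespoon": ["aardvark", "twelve"],
             "of": ["puddle", "hello"],
             "preserves": ["where", "dear"],
             "or": ["coaxial", "forever"]}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def score(t, c):
    """Probability that (t, c) is a true target/context pair."""
    return sigmoid(T[t] @ C[c])

before = score("apricot", "tablespoon")
for _ in range(200):  # epochs (arbitrary)
    for t, c in positives:
        # gradient ascent on log sigma(c.t) + sum_i log sigma(-n_i.t)
        t_old = T[t].copy()
        g = 1.0 - sigmoid(C[c] @ T[t])   # d/dt of log sigma(c.t) scale
        T[t] += lr * g * C[c]
        C[c] += lr * g * t_old
        for n in negatives[c]:
            gn = sigmoid(C[n] @ t_old)   # negative term pushes n away from t
            T[t] -= lr * gn * C[n]
            C[n] -= lr * gn * t_old
after = score("apricot", "tablespoon")

# Q3: training worked if the positive-pair probability increased
print(before, "->", after)
```

This is a sketch under stated assumptions, not word2vec's exact implementation (real training shuffles samples, resamples negatives each pass, and decays the learning rate), but it shows the mechanics: random dense initialization, per-pair SGD updates, and a before/after check as a correctness signal.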