Data Science Asked by Jack Twain on July 18, 2021
I am learning about matrix factorization for recommender systems, and I keep seeing the term latent features, but I am unable to understand what it means. I know what a feature is, but I don't understand the idea of latent features. Could someone please explain it? Or at least point me to a paper/place where I can read about it?
It seems to me that latent features is a term used to describe criteria for classifying entities by their structure, in other words, by the features (traits) they contain, rather than by the classes they belong to. The meaning of the word "latent" here is most likely similar to its meaning in the social sciences, where the very popular term latent variable means an unobservable variable (concept).
Section "Introduction" in this paper provides a good explanation of latent features' meaning and use in modeling of social sciences phenomena.
Answered by Aleksandr Blekh on July 18, 2021
At the risk of oversimplification, latent features are 'hidden' features, as distinguished from observed features. Latent features are computed from observed features using matrix factorization. An example would be text document analysis. 'Words' extracted from the documents are features. If you factorize the word data, you can find 'topics', where a 'topic' is a group of semantically related words. Low-rank matrix factorization maps many rows (observed features) to a smaller set of rows (latent features). To elaborate, a document could have observed features (words) like [sail-boat, schooner, yacht, steamer, cruiser], which would 'factorize' to latent features (topics) like 'ship' and 'boat'.
[sail-boat, schooner, yacht, steamer, cruiser, ...] -> [ship, boat]
The underlying idea is that latent features are semantically relevant 'aggregates' of observed features. When you have large-scale, high-dimensional, and noisy observed features, it makes sense to build your classifier on latent features.
This is of course a simplified description to elucidate the concept. You can read the details of the Latent Dirichlet Allocation (LDA) or probabilistic Latent Semantic Analysis (pLSA) models for a more accurate description.
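To make the idea concrete, here is a minimal sketch of recovering 'topics' from word counts via non-negative matrix factorization (NMF is just one factorization that works here; the tiny corpus, the choice of two topics, and the parameter values are all assumptions made for illustration):

```python
# A minimal sketch: factorize a document-word count matrix into latent topics.
# The corpus, n_components=2, and parameter values are illustrative assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import NMF

docs = [
    "the schooner and the yacht sailed past the steamer",
    "a cruiser and a sail-boat raced the schooner into port",
    "the model learns weights from training data",
    "training a model requires data and compute",
]

vec = CountVectorizer()
X = vec.fit_transform(docs)                 # observed features: word counts

nmf = NMF(n_components=2, init="nndsvd", random_state=0)
W = nmf.fit_transform(X)                    # documents x latent topics
H = nmf.components_                         # latent topics x words

words = vec.get_feature_names_out()
for k, topic in enumerate(H):
    top = words[topic.argsort()[::-1][:3]]  # highest-weighted words per topic
    print(f"topic {k}: {list(top)}")
```

Each row of H groups co-occurring words into one latent feature, which is the 'aggregation' described above.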
Answered by Dynamic Stardust on July 18, 2021
Suppose you have an $M \times N$ sparse matrix, where $M$ stands for the number of users who gave recommendations and $N$ is the number of items recommended. The $x_{ij}$ element of the matrix is the recommendation given, with some elements missing, i.e. to be predicted.
Then your matrix can be "factorized" by introducing $K$ "latent factors", so that instead of one matrix you have two: an $M \times K$ matrix for users and a $K \times N$ matrix for items, whose matrix product reproduces the original matrix.
Finally, to your question: what are latent features in matrix factorization? They are the $K$ unknown features of user tastes and recommended items, such that when the two matrices are multiplied, they produce the matrix of known recommendations. The particular weights (a user's preference for a particular feature, and the amount of a feature in a particular item) are found via the so-called Alternating Least Squares (ALS) algorithm, about which you can read more here.
Answered by Sergey Bushmanov on July 18, 2021
For another example, consider a user-to-movie rating matrix like the Netflix setup. This will be a huge sparse matrix that is difficult to process.
Note that each user will have specific preferences, such as for sci-fi movies or romance movies. So, instead of storing all the movie ratings, we could store a few latent features, one per genre (e.g., sci-fi or romance), each quantifying the user's taste for that category. These are called latent features; they capture the essence of the user's taste rather than storing the entire movie list.
Of course this will be an approximation, but on the flip side, you have very little to store.
This is usually done using matrix decomposition techniques, like SVD, which (in the rank-1 case) breaks an $N \times N$ user-to-item recommendation matrix into an $N \times 1$ user preference matrix and a $1 \times N$ item preference matrix. An added advantage is that instead of storing $N^2$ numbers, we effectively store $2N$.
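Here is a minimal sketch of that rank-1 truncation with NumPy (the toy matrix and the choice of keeping $k = 1$ singular value are assumptions for illustration; real systems keep more latent dimensions):

```python
import numpy as np

# Toy user-to-item rating matrix (dense here for simplicity).
R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [1, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)

U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 1                                   # keep a single latent dimension (assumed)
R_approx = U[:, :k] * s[:k] @ Vt[:k]    # rank-1 approximation: 2N numbers instead of N^2

print(np.round(R_approx, 2))            # approximate ratings from one latent feature
```

The single retained singular vector pair plays the role of the "taste" feature described above: one number per user and one per item, whose product approximates the full rating matrix.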
Answered by Sanjay on July 18, 2021