Cross Validated Asked on December 8, 2021
I searched for what multivariate data is and found the following definitions, which confuse me.
Def 1: Multivariate data is multi-dimensional data, i.e. it has more than one independent variable.
Def 2: Multivariate data has multiple responses, i.e. more than one response variable.
Def 3: Multivariate data is multi-dimensional data, i.e. it has more than one independent variable, and the relationships among the independent variables are considered.
Which one is correct, and what actually is multivariate data?
I tend to agree with definition 3.
If there is no interdependence or relation among the multiple variables, then separate univariate models, one per variable, can be used just as well without sacrificing anything.
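To make that point concrete, here is a minimal sketch for the Gaussian case only. It assumes normally distributed variables, and the function names and numbers are made up for illustration. With a diagonal covariance matrix (no interdependence), the joint log-density is exactly the sum of the univariate log-densities. Once an off-diagonal covariance is introduced, the two no longer agree.

```python
import numpy as np

def joint_logpdf(x, mean, cov):
    """Log-density of a multivariate normal evaluated at the point x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + quad)

def sum_of_univariate_logpdfs(x, mean, variances):
    """Sum of independent univariate normal log-densities, one per variable."""
    return np.sum(-0.5 * (np.log(2 * np.pi * variances) + (x - mean) ** 2 / variances))

# Illustrative values only (not from any real data set).
x = np.array([0.5, -1.0, 2.0])
mean = np.zeros(3)
variances = np.array([1.0, 2.0, 0.5])

# No interdependence: diagonal covariance matrix.
diag_cov = np.diag(variances)
joint_indep = joint_logpdf(x, mean, diag_cov)
univ_sum = sum_of_univariate_logpdfs(x, mean, variances)

# Interdependence: add a covariance between the first two variables.
corr_cov = diag_cov.copy()
corr_cov[0, 1] = corr_cov[1, 0] = 0.9
joint_corr = joint_logpdf(x, mean, corr_cov)

print(np.isclose(joint_indep, univ_sum))  # the two descriptions agree
print(np.isclose(joint_corr, univ_sum))   # they disagree once variables covary
```

In the second case the univariate models still describe each variable's marginal distribution, but they miss the joint structure, which is what a multivariate model is for.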
Restricting "multiple variables" to mean "responses" (i.e. requiring that there be a predictor) contradicts what I was taught, since books on multivariate analysis (e.g. Johnson and Wichern, 2007) also present principal component analysis (PCA), factor analysis (FA), clustering, and discriminant analysis as multivariate methods. Therefore I'm inclined to say that "multivariate" has more to do with "multi" and "interdependence" than strictly with "response".
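As an illustration that a multivariate method need not involve any response variable, here is a rough sketch of PCA done with a plain SVD. The data are synthetic and generated only for this example. All three columns are treated symmetrically, and the method works purely from how the variables vary together.

```python
import numpy as np

# Synthetic data (an assumption for illustration): two strongly related
# columns plus one small independent noise column. No column is a "response".
rng = np.random.default_rng(0)
z = rng.normal(size=200)
X = np.column_stack([
    z,
    2.0 * z + 0.1 * rng.normal(size=200),
    0.1 * rng.normal(size=200),
])

# PCA via SVD of the column-centered data matrix.
Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Rows of Vt are the principal directions; squared singular values are
# proportional to the variance captured along each direction.
explained = s**2 / np.sum(s**2)
print(explained)
```

Because the first two columns move together, most of the variance ends up on the first component. That structure is exactly the interdependence that separate univariate summaries would not show.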
But since you tagged this under machine learning: there, a multivariate model usually (though not always) just means a model with more than one output to predict.
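A small sketch of that machine-learning sense, using ordinary least squares on synthetic data (the coefficients and noise level are invented for the example): the target `Y` has two columns, so a single fit predicts two outputs at once.

```python
import numpy as np

# Synthetic data (assumed for illustration): intercept + 2 predictors, 2 outputs.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
true_B = np.array([[1.0, -2.0],
                   [0.5,  0.0],
                   [0.0,  3.0]])
Y = X @ true_B + 0.1 * rng.normal(size=(50, 2))

# One multi-output fit: B_joint has one column of coefficients per output.
B_joint, *_ = np.linalg.lstsq(X, Y, rcond=None)

# The same outputs fitted one at a time as separate univariate regressions.
B_separate = np.column_stack([
    np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(Y.shape[1])
])

print(B_joint.shape)                     # (3, 2)
print(np.allclose(B_joint, B_separate))  # coefficient estimates coincide
```

For plain least squares the coefficient estimates are the same either way, so here "multivariate" really does just mean "more than one output". The relationship between the outputs only matters for models or inference that use the joint error structure.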
Answered by Nuclear03020704 on December 8, 2021
From what I was taught, multivariate data has more than 1 response variable. This link explains it in detail, and contrasts it with univariate (1 variable) and bivariate (2 variables) data.
A common example of multivariate data is community data, i.e. the abundances of a range of different species.
Answered by Pitto on December 8, 2021