TransWikia.com

What is the difference between the terms correlation, correlated, and collinearity?

Data Science Asked on August 30, 2021

A website says correlation refers to an increase/decrease in a dependent variable with an increase/decrease in an independent variable. Collinearity refers to two or more independent variables acting in concert to explain the variation in a dependent variable. Could someone clarify the terms?

2 Answers

Collinearity usually refers to any linear relationship or association between 2 or more features.

Correlation and correlated are more general, and can refer to any type of relationship between features and responses, including log, exponential and linear associations.

The word "correlation" is a noun. Its strength is measured by a specific formula that depends on the data type and on assumptions, such as whether the method is parametric or non-parametric.

The word "correlated" is an adjective and indicates a loose association between two variables, i.e. it does not indicate a causal relationship.

Correct answer by Donald S on August 30, 2021

Pearson correlation is the usual correlation when nothing further is specified and specifically refers to linear association.

$$\rho_{XY}=\dfrac{\operatorname{cov}(X,Y)}{\sigma_X\sigma_Y}$$

In everyday usage, people use “correlation” to mean any kind of association, but this is wrong from the standpoint of statistics. Arrange points symmetrically on a parabola and run them through the Pearson formula; you’ll get zero correlation, despite the obvious relationship.
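As a rough illustration of the parabola point, here is a minimal sketch in plain Python. The helper name `pearson` and the sample points are my own choices for illustration, not anything from a particular library.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors in the covariance and the standard deviations cancel,
    # so plain sums of deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical points placed symmetrically around x = 0 on y = x^2.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_parabola = pearson(xs, ys)
print(r_parabola)  # zero (up to floating-point sign), although y is fully determined by x
```

The positive and negative halves of the parabola cancel in the covariance sum, which is why a purely linear measure reports nothing here.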

There is also Spearman correlation, which is the Pearson correlation computed on the ranks of the values.

If the points are $(0,1)$, $(2,4)$, $(3,3)$, the Spearman correlation is calculated by converting the $x$-values to their ranks and the $y$-values to their ranks: $(1,1)$, $(2,3)$, $(3,2)$. Then run the transformed points through the usual equation for (Pearson) correlation.
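The three-point example can be worked through in a short sketch. The helpers `ranks` and `pearson` are hypothetical names written for this illustration, and `ranks` assumes there are no tied values (real implementations average the ranks of ties).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def ranks(values):
    """Rank from 1 (smallest) upward. Assumes no ties, for simplicity."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

points = [(0, 1), (2, 4), (3, 3)]
x_ranks = ranks([x for x, _ in points])  # [1, 2, 3]
y_ranks = ranks([y for _, y in points])  # [1, 3, 2]

# Spearman correlation is Pearson correlation applied to the ranks.
spearman = pearson(x_ranks, y_ranks)
print(x_ranks, y_ranks, spearman)  # [1, 2, 3] [1, 3, 2] 0.5
```

By hand: the rank deviations from the mean of 2 are $(-1, 0, 1)$ and $(-1, 1, 0)$, so the numerator is $1$ and the denominator is $\sqrt{2 \cdot 2} = 2$, giving $0.5$.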

To separate “correlation” and “correlated”, the former is a noun while the latter is an adjective. If there is “correlation” between two variables, they are “correlated”.

“Collinear” tends to come up in the context of regression and refers to predictor variables that are correlated. The related “multicollinear” describes multiple regression predictors that have a linear relationship with other predictors, as if you could regress one predictor on some of the others and get decent accuracy. “Multicollinearity” seems to be the more common term when we talk about related predictors. To me, “collinear” variables are perfectly related, with a correlation of $1$ (think of the same measurements recorded in both meters and kilometers), while multicollinearity does not imply perfect predictive ability unless “perfect” multicollinearity is specified.
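A small sketch of the meters/kilometers case, next to a predictor that is strongly but not perfectly related. All numbers are made up for illustration, and `pearson` is a hypothetical helper rather than a library function.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical distances recorded in meters.
meters = [1200.0, 3500.0, 800.0, 5000.0, 2750.0]

# The same distances in kilometers: an exact linear function of `meters`.
kilometers = [m / 1000 for m in meters]

# A hypothetical second measurement of the same distances, with some error.
noisy_km = [1.3, 3.4, 0.9, 4.8, 2.9]

r_perfect = pearson(meters, kilometers)   # 1 up to floating-point rounding
r_imperfect = pearson(meters, noisy_km)   # high, but below 1

print(r_perfect, r_imperfect)
```

In a regression, the first pair carries exactly the same information twice, so one of the two predictors is redundant. The second pair is the more typical situation: strongly related predictors that are still not exact copies of each other.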

“Collinear” and “multicollinear” are adjectives; “collinearity” and “multicollinearity” are the nouns.

Answered by Dave on August 30, 2021
