TransWikia.com

Can the dependency between variables be deduced from data? And if so, how?

Data Science Asked by coliva on April 1, 2021

I have a data set $X$ that consists of $m$ vectors $vec{x}$ of $n$ real-valued components. Each vector component lies within a corresponding predefined interval of valid values, which is the same for all vectors in $X$. The assumption is that there exists a dependency graph between the components of each vector, which is also the same for all vectors; for example, the value of the component $x_k$ (maybe) depends on the values of both components $x_p$ and $x_q$ for all $vec{x} in X$ and $k neq p neq q$. However, we do not know the exact structure of this graph. In other words, we suppose that there exists a dependency between the variables of each data point, but we don’t know how it is structured. So the problem is to deduce this dependency graph using only the data available in $X$.

My question is: is there a method or algorithm that allows me to solve this problem? And if so, can somebody point me to a source where I can learn more about it?

One Answer

Yes, We can do that. We can measure the linear relation between two variables by correlation metric to know more see this. If you want to identify the linear relationship between more than two variables we can do it by VIF scores. If a VIF score of a variable is >5 we can say this variable can be expressed as linear combination of other variables to know more see this. If you want to implement in python consider this.

Answered by Venkatesh Gandi on April 1, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP