TransWikia.com

Why does the same parameter estimate change in multiple regressions?

Economics Asked on January 22, 2021

I have a set of results from a multiple regression table with four columns. In each column, a new variable is added. Why, however, is the initial variable coefficient different in each column? Is it due to the coefficients being jointly estimated?

One Answer

Yes, it is because the coefficients are estimated jointly, so each point estimate is conditional on the others, and also because of omitted variable bias. To be more specific, contrast univariate OLS with bivariate OLS: both are estimated by minimizing the sum of squared residuals, but the resulting formulas for the estimated coefficients differ.

In univariate OLS, minimizing the sum of squared residuals gives

$$\min_{b_0,b_1}\sum e^2 = \sum (y_i-b_0-b_1x_{1i})^2 \implies \hat{b}_1 = \frac{\sum (x_{1i}-\bar{x}_1)(y_i-\bar{y})}{\sum (x_{1i}-\bar{x}_1)^2}$$
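As a sanity check, the closed-form slope above can be compared against a library fit. This is a minimal sketch on synthetic data; all the numbers (sample size, true coefficients) are illustrative assumptions, not from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
y = 1.0 + 2.0 * x1 + rng.normal(size=500)  # illustrative true model

# Slope from the closed-form expression above
b1_hat = np.sum((x1 - x1.mean()) * (y - y.mean())) / np.sum((x1 - x1.mean()) ** 2)

# Same slope from a library least-squares fit (degree-1 polynomial)
b1_lib = np.polyfit(x1, y, 1)[0]

print(b1_hat, b1_lib)  # the two agree up to floating point
```

Both routes minimize the same sum of squared residuals, so they return the same slope up to floating-point error.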

In a bivariate regression the result is different:

$$\min_{b_0,b_1,b_2}\sum e^2 = \sum (y_i-b_0-b_1x_{1i}-b_2x_{2i})^2 \implies \hat{b}_1 = \frac{\sum (x_{2i}-\bar{x}_2)^2\sum (x_{1i}-\bar{x}_1)(y_i-\bar{y})- \sum (x_{1i}-\bar{x}_1)(x_{2i}-\bar{x}_2)\sum (x_{2i}-\bar{x}_2)(y_i-\bar{y})}{\sum (x_{1i}-\bar{x}_1)^2 \sum (x_{2i}-\bar{x}_2)^2- \left(\sum (x_{1i}-\bar{x}_1)(x_{2i}-\bar{x}_2)\right)^2}$$

Hence we can see that in the second expression the calculation of $\hat{b}_1$ critically depends on the covariance of $x_1$ and $x_2$ (i.e. $\sum (x_{1i}-\bar{x}_1)(x_{2i}-\bar{x}_2)$), the variance of $x_2$ (i.e. $\sum (x_{2i}-\bar{x}_2)^2$), and the covariance of $x_2$ and $y$ (i.e. $\sum (x_{2i}-\bar{x}_2)(y_i-\bar{y})$). Including an additional variable therefore changes the formula for $\hat{b}_1$ (and for the other coefficients). This extends to more than two variables as well, with the formulas growing progressively more complex.
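The dependence of $\hat{b}_1$ on these cross-moments can be checked numerically. The sketch below uses synthetic data with deliberately correlated regressors (all values are illustrative assumptions) and compares the closed-form two-regressor slope with a least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)          # correlated regressors
y = 1.0 + 0.1 * x1 - 2.0 * x2 + rng.normal(size=n)

# Demeaned sums of squares and cross-products
d1, d2, dy = x1 - x1.mean(), x2 - x2.mean(), y - y.mean()
S11, S22, S12 = d1 @ d1, d2 @ d2, d1 @ d2
S1y, S2y = d1 @ dy, d2 @ dy

# Closed-form b1 from the two-regressor formula above
b1_formula = (S22 * S1y - S12 * S2y) / (S11 * S22 - S12 ** 2)

# The same coefficient from least squares on [1, x1, x2]
X = np.column_stack([np.ones(n), x1, x2])
b1_lstsq = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(b1_formula, b1_lstsq)  # identical up to floating point
```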

Moreover, the coefficients can change substantially if omitted variable bias is present. Suppose the true model is

$$y= \beta_0 + \beta_1 x_1 +\beta_2 x_2 + u$$

but you fit a univariate model regardless:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1$$

Under these conditions it can be shown that, even though we fit a univariate regression, the expected value of the estimated coefficient is (see Wooldridge, Introductory Econometrics):

$$E[\hat{\beta}_1]= \beta_1 + \beta_2 \frac{\sum (x_{1i}-\bar{x}_1)(x_{2i}-\bar{x}_2)}{\sum (x_{1i}-\bar{x}_1)^2}$$

where the second term is the omitted variable bias. The intuition is that if $x_1$ and $x_2$ covary, and $y$ actually depends on $x_2$ as well (i.e. $\beta_2 \neq 0$), then fitting a univariate regression to a bivariate relationship biases the estimate of $\beta_1$, because it partially captures the relationship between $y$ and $x_2$. For example, if $\beta_1=0.1$, $\beta_2= -2$, $\operatorname{cov}(x_1,x_2)=0.5$ and $\operatorname{var}(x_1)=2$, then the univariate regression gives $E[\hat{\beta}_1] = 0.1 + (-2)\cdot 0.5/2 = -0.4$, whereas the bivariate regression gives $\hat{\beta}_1=0.1$ and $\hat{\beta}_2=-2$ in expectation.
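The worked example can be reproduced in simulation. The sketch below draws data with the stated population moments ($\operatorname{var}(x_1)=2$, $\operatorname{cov}(x_1,x_2)=0.5$, $\beta_1=0.1$, $\beta_2=-2$; everything else, such as the sample size and error variance, is an illustrative assumption) and checks the sample analogue of the bias formula, which holds exactly within any given sample:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
# Population moments matching the worked example:
# var(x1) = 2 and cov(x1, x2) = 0.25 * var(x1) = 0.5,
# so the bias term is beta2 * cov/var = -2 * 0.5/2 = -0.5
x1 = rng.normal(scale=np.sqrt(2.0), size=n)
x2 = 0.25 * x1 + rng.normal(size=n)
y = 0.1 * x1 - 2.0 * x2 + rng.normal(size=n)   # true beta1 = 0.1, beta2 = -2

# Long (correct) regression: coefficients near [0, 0.1, -2]
X_long = np.column_stack([np.ones(n), x1, x2])
b_long = np.linalg.lstsq(X_long, y, rcond=None)[0]

# Short regression omitting x2: slope near 0.1 - 0.5 = -0.4
X_short = np.column_stack([np.ones(n), x1])
b_short = np.linalg.lstsq(X_short, y, rcond=None)[0]

# Sample analogue of the bias formula: short slope equals
# long beta1-hat plus long beta2-hat times the slope of x2 on x1
delta1 = np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)
print(b_short[1], b_long[1] + b_long[2] * delta1)
```

The in-sample identity printed on the last line is exact (it is the standard omitted-variable decomposition), while the short-regression slope itself is only close to $-0.4$ up to sampling noise.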

Hence, to sum up: the coefficients differ because in a multivariate regression they are calculated conditional on the other coefficients, and also because if you omit a relevant independent variable, its effect will be 'hidden' in the estimates of the non-omitted variables (if they are correlated).

Correct answer by 1muflon1 on January 22, 2021
