Economics Asked on February 24, 2021
I am estimating an autoregressive distributed lag model, and I’ve read that I must determine the lag length of my autoregressive term separately from the lag length of the other regressors in the model.
For instance, given the model $y_t = \delta_0 + \alpha_1 y_{t-1} + \alpha_2 y_{t-2} + \alpha_3 y_{t-3} + \gamma_1 z_{t-1} + \gamma_2 z_{t-2} + u_t$
Wooldridge 18-5 (p.590 in 6th edition) writes, “As a practical matter, how do we decide on which lags of $y$ and $z$ to include? First, we start by estimating an autoregressive model for $y$… Once an autoregressive model for $y$ has been chosen, we can test for lags of $z$.”
But there’s no explanation of why we must test for lags of $y$ first and only then for lags of $z$. Can you explain why we do this?
Many thanks-
Maurus
The reason is that in most dynamic models, i.e. models that include lags of the dependent variable, you have to include enough lags to get rid of autocorrelation in the errors.
This is because in a dynamic model autocorrelation can no longer be handled just by correcting the standard errors: with autocorrelated errors, OLS in a dynamic model is not just inefficient but also biased and inconsistent. Hence in a dynamic model it is not enough to use, say, Newey-West or other autocorrelation-consistent standard errors; you need to get the dynamic specification (the lags) right so that no autocorrelation remains in the errors.
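A small simulation can illustrate the point. The sketch below is only an illustration under assumed values: the coefficients (0.5 on lagged $y$, 1.0 on lagged $x$), the AR(1) error coefficient 0.6, the sample size, and the helper names `simulate_ardl` and `ols_ardl11` are all made up for this example. It fits the same regression by OLS once with white-noise errors and once with autocorrelated errors.

```python
import numpy as np

def simulate_ardl(T, beta2, beta3, gamma, rng):
    """Simulate y_t = 1 + beta2*y_{t-1} + beta3*x_{t-1} + eps_t,
    where eps_t = gamma*eps_{t-1} + u_t and u_t is white noise."""
    x = rng.standard_normal(T)
    u = rng.standard_normal(T)
    eps = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        eps[t] = gamma * eps[t - 1] + u[t]
        y[t] = 1.0 + beta2 * y[t - 1] + beta3 * x[t - 1] + eps[t]
    return y, x

def ols_ardl11(y, x):
    """OLS of y_t on a constant, y_{t-1} and x_{t-1}."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1], x[:-1]])
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return coef

rng = np.random.default_rng(0)

# Case 1: no autocorrelation in the errors (gamma = 0).
y0, x0 = simulate_ardl(20000, 0.5, 1.0, 0.0, rng)
b_white = ols_ardl11(y0, x0)

# Case 2: AR(1) errors (gamma = 0.6), same regression.
y1, x1 = simulate_ardl(20000, 0.5, 1.0, 0.6, rng)
b_ar = ols_ardl11(y1, x1)

print("true coefficient on y_{t-1}: 0.5")
print("OLS estimate, white-noise errors:", b_white[1])
print("OLS estimate, AR(1) errors:      ", b_ar[1])
```

With white-noise errors the estimate should sit close to the true value, while with AR(1) errors it stays away from 0.5 even in a large sample. Robust standard errors change only the reported uncertainty, not the point estimate, so they cannot repair this.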
More formally, you can see why this is necessary by considering a simple ARDL(1,1) model:
$$y_t = \beta_1 + \beta_2 y_{t-1} + \beta_3 x_{t-1} + \epsilon_t$$
Now imagine that this specification still has first-order autocorrelation, which would imply that:
$$\epsilon_t = \gamma \epsilon_{t-1} + u_t$$
Now substitute the autocorrelated error back into the original equation and you get:
$$y_t = \beta_1 + \beta_2 y_{t-1} + \beta_3 x_{t-1} + \gamma \epsilon_{t-1} + u_t$$
Now if we lag the entire first equation by one period we get:
$$y_{t-1} = \beta_1 + \beta_2 y_{t-2} + \beta_3 x_{t-2} + \epsilon_{t-1}$$
Hence you can see that in the presence of autocorrelation the lagged error term $\epsilon_{t-1}$ enters not just the present error $\epsilon_t$ but also the lagged dependent variable $y_{t-1}$, so the regressor $y_{t-1}$ is correlated with the error $\epsilon_t$. This violates the Gauss-Markov assumptions of OLS, specifically the assumption that the error term and the regressors are uncorrelated. In this case the estimator is not salvageable, unlike the standard case where autocorrelation only makes the error correlated with past errors.
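To make the correlation explicit, here is a short worked calculation. It rests on assumptions that are not stated in the answer itself: the process is stationary, $x$ is uncorrelated with $\epsilon$ at all leads and lags, $u_t$ is white noise uncorrelated with everything dated $t-1$ or earlier, and $\sigma_\epsilon^2$ denotes the variance of $\epsilon_t$.

$$\operatorname{Cov}(y_{t-1}, \epsilon_t) = \operatorname{Cov}(y_{t-1}, \gamma \epsilon_{t-1} + u_t) = \gamma \operatorname{Cov}(y_{t-1}, \epsilon_{t-1})$$

$$\operatorname{Cov}(y_{t-1}, \epsilon_{t-1}) = \beta_2 \operatorname{Cov}(y_{t-2}, \epsilon_{t-1}) + \sigma_\epsilon^2 = \beta_2 \gamma \operatorname{Cov}(y_{t-2}, \epsilon_{t-2}) + \sigma_\epsilon^2$$

Under stationarity the covariance on both sides of the second line is the same number, so solving for it gives

$$\operatorname{Cov}(y_{t-1}, \epsilon_t) = \frac{\gamma \sigma_\epsilon^2}{1 - \beta_2 \gamma}$$

which is zero only when $\gamma = 0$, i.e. only when there is no autocorrelation left in the error.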
Answered by 1muflon1 on February 24, 2021
After some of my own back-and-forth, I'm going to try and answer my own question. I hope I might get some comments on whether or not I'm correct. If I'm understanding 1muflon1's response correctly, they provided a very useful explanation of why it's important to get the number of lags correct in an ARDL, but as Michael pointed out, that wasn't my question (though I learned from 1muflon1's response, nevertheless.) I take responsibility for this though, because my question confused the steps for an ARDL model with Wooldridge's specific purpose in the section I referenced (18-5b, p.590 in 6th edition).
The context of the section I referenced in Wooldridge is Granger causality. In other words, we want to know whether past values of a random variable $z$ ($z_{t-1}, \dots, z_{t-h}$) provide useful information for predicting the current value of another random variable, $y$ ($y_t$), even after we control for past values of $y$ ($y_{t-1}, \dots, y_{t-h}$).
Our question necessarily informs the steps of our method. We first establish how many past values of $y$ give us useful information to predict $y_t$. Then we use a linear model to test whether, after controlling for the information provided by those past values of $y$, any lags of $z$ have a non-zero coefficient. If they do, we say that $z$ "Granger causes" $y$; if they do not, we conclude that $z$ does not Granger cause $y$.
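The two steps described above can be sketched in code. This is a minimal illustration, not Wooldridge's own procedure: choosing the AR order by BIC is one common option that I am assuming here, the data are simulated with made-up coefficients, and the helper names (`lag_matrix`, `ssr`, `choose_ar_lags`, `granger_f`) are hypothetical. The F statistic compares the model with only lags of $y$ against the model that adds lags of $z$.

```python
import numpy as np

def lag_matrix(series, lags, max_lag):
    """Columns series[t-1], ..., series[t-lags] for t = max_lag, ..., T-1."""
    T = len(series)
    return np.column_stack([series[max_lag - k : T - k] for k in range(1, lags + 1)])

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def choose_ar_lags(y, max_lag):
    """Step 1: pick the AR order for y by BIC on a common sample."""
    target = y[max_lag:]
    n = len(target)
    best_p, best_bic = 1, np.inf
    for p in range(1, max_lag + 1):
        X = np.column_stack([np.ones(n), lag_matrix(y, p, max_lag)])
        bic = n * np.log(ssr(X, target) / n) + (p + 1) * np.log(n)
        if bic < best_bic:
            best_p, best_bic = p, bic
    return best_p

def granger_f(y, z, p, q):
    """Step 2: F statistic for H0 'all q lags of z have zero coefficients',
    controlling for p lags of y. Returns (F, df1, df2)."""
    m = max(p, q)
    target = y[m:]
    n = len(target)
    X_r = np.column_stack([np.ones(n), lag_matrix(y, p, m)])  # restricted: lags of y only
    X_u = np.column_stack([X_r, lag_matrix(z, q, m)])         # unrestricted: add lags of z
    ssr_r, ssr_u = ssr(X_r, target), ssr(X_u, target)
    df2 = n - X_u.shape[1]
    F = ((ssr_r - ssr_u) / q) / (ssr_u / df2)
    return F, q, df2

# Simulated example (assumed values): z_{t-1} helps predict y_t.
rng = np.random.default_rng(1)
T = 500
z = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * z[t - 1] + rng.standard_normal()

p = choose_ar_lags(y, max_lag=4)
F, df1, df2 = granger_f(y, z, p, q=2)
print("chosen AR order:", p)
print("F statistic:", F, "with df", (df1, df2))
```

A large F statistic relative to the F(df1, df2) critical value rejects the null that the lags of $z$ add nothing, which is the sense in which $z$ Granger causes $y$. Fixing the lags of $y$ first matters because the test is about what $z$ adds beyond the history of $y$ itself.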
Answered by M.M. on February 24, 2021