How to evaluate if year-over-year data is correlated and predictive

Question

I'm working with data that is separated by year and classification, and I want to test whether the data is correlated from year to year. Essentially, I want to determine whether each year's data is random, or whether one year's data is correlated with the previous year's and possibly predictive of the future.

What statistical test is best for this?

For example:
[Image: example table of yearly values for users a, b and c]

Using this data, I would offhand assume that there is correlation for users a and c. However, user b bounces between .4, .1 and .8, so I would assume his yearly performances are fairly random.

Any and all help is appreciated! And let me know if any further clarification is helpful.

Michael Grogan · Answer

What you are describing is known as serial correlation (also called autocorrelation), where the residuals at time t are correlated with the residuals at time t-1.

While an ACF (autocorrelation function) plot can help you gauge the degree of autocorrelation across lags, you should formally test for this condition using the Durbin-Watson test, where:

H0: No serial correlation present

HA: Serial correlation present
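Before running the formal test, the ACF idea mentioned above can be sketched in a few lines. This is a minimal illustration, not a replacement for a statistics library: the function name `autocorrelation` and both example series are made up here, and the estimator is the usual sample autocorrelation (lagged products of deviations from the mean, divided by the total sum of squared deviations).

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (1 <= lag < len)."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Sum of products of each deviation with the deviation `lag` steps earlier.
    numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
    return numerator / denominator


# Hypothetical yearly values, invented purely for illustration.
trending = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
alternating = [0.2, 0.8, 0.2, 0.8, 0.2, 0.8, 0.2, 0.8]

for name, values in [("trending", trending), ("alternating", alternating)]:
    acf = [round(autocorrelation(values, k), 3) for k in range(1, 4)]
    print(name, acf)
```

A steadily trending series gives a positive lag-1 value, while a series that flips up and down every year gives a negative one. With only a handful of years per user, such estimates are very noisy and should be read with caution.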

The Durbin-Watson statistic is calculated as follows:
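In its standard textbook form, writing e_t for the residual at time t and T for the number of observations (this notation is assumed here, not taken from the answer), the statistic is:

```latex
d = \frac{\sum_{t=2}^{T} \left( e_t - e_{t-1} \right)^2}{\sum_{t=1}^{T} e_t^2}
```

The value of d lies between 0 and 4. Values near 2 are consistent with no first-order serial correlation, values well below 2 point to positive serial correlation, and values well above 2 point to negative serial correlation.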

Note that this assumes your dataset is purely a time series. If it is a panel dataset, i.e. both cross-sectional and time series, then you may need to use panel modelling instead, or at least test for serial correlation within each cross-sectional unit separately.
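Testing each cross-section separately could look roughly like the sketch below. It computes the Durbin-Watson statistic (the sum of squared successive differences of the residuals divided by the sum of squared residuals) once per user. Several things here are assumptions for illustration only: the `panel` values are invented, the helper names are hypothetical, and the "model" is just each user's own mean, so the residuals are simply deviations from that mean. With a real regression you would pass in that model's residuals instead.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of residuals ordered in time."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero; statistic is undefined")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    return numerator / denominator


def mean_residuals(values):
    """Residuals from an intercept-only model (assumption: no other regressors)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical panel: one yearly series per user, invented for illustration.
panel = {
    "a": [0.50, 0.55, 0.60, 0.62, 0.70, 0.74],  # drifts steadily upward
    "b": [0.40, 0.10, 0.80, 0.20, 0.70, 0.30],  # bounces from year to year
}

# One statistic per cross-sectional unit (here: per user).
dw_by_user = {
    user: durbin_watson(mean_residuals(values)) for user, values in panel.items()
}

for user, dw in dw_by_user.items():
    print(f"user {user}: DW = {dw:.3f}")
```

On these made-up series the drifting user comes out well below 2 and the bouncing user well above 2. Whether a given value is significant depends on the critical values for the sample size and number of regressors, and with very few years per user the test has little power.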
