Why can we not use baseline as a predictor of change in this case?

Question

It is generally recommended that baseline should not be used as a predictor when change is the outcome variable. Explanations for this have treated both the baseline and the final values as random (e.g. see here). However, the final value is likely to be close to, and related to, the initial value. It is the change (the effect of the drug or intervention) that can be taken as random. In that case, change is not "mathematically coupled" with baseline.
If we take baseline and change as the random values (rather than baseline and final values), keeping baseline as a predictor of change does not seem to give spuriously positive results:
N <- 200
x1 <- rnorm(N, 50, 10)
trt <- c(rep(0, N/2), rep(1, N/2))  # allocate to 2 groups
change <- rnorm(N, 10, 5)
summary(lm(change ~ x1 * trt))

The output of the above is:
Call:
lm(formula = change ~ x1 * trt)

Residuals:
     Min       1Q   Median       3Q      Max 
-13.5833  -3.3792  -0.0617   3.5979  16.7672

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)  6.32244    3.07271   2.058    0.041 *
x1           0.08295    0.06260   1.325    0.187     << NOTE: NOTHING IS SIGNIFICANT
trt          3.86000    4.14419   0.931    0.353  
x1:trt      -0.08836    0.08204  -1.077    0.283  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.256 on 196 degrees of freedom
Multiple R-squared:  0.01061,   Adjusted R-squared:  -0.004531 
F-statistic: 0.7008 on 3 and 196 DF,  p-value: 0.5526
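
For contrast, here is a minimal sketch of the setup that the usual explanations seem to assume, where baseline and final values are both random and change is computed as their difference. The names `baseline`, `final`, `change_coupled` and `fit_coupled` are my own, and drawing `final` independently of `baseline` with mean 60 is an assumption made only to illustrate the coupling, not a claim about real data:

```r
set.seed(1)  # arbitrary seed, only so this sketch is reproducible

N <- 200
baseline <- rnorm(N, 50, 10)
final    <- rnorm(N, 60, 10)       # assumption: drawn independently of baseline

# Change now contains "-baseline" by construction
change_coupled <- final - baseline

# Regress change on baseline: the slope is pulled towards -1
fit_coupled <- lm(change_coupled ~ baseline)
coef(summary(fit_coupled))
```

Under this assumption the covariance between change and baseline equals minus the variance of baseline, so the baseline coefficient should come out strongly negative even though no real effect was simulated. That is the coupling my own simulation above avoids.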

Of course, we should not keep baseline as a predictor if we take "percent_change" (100*change/baseline) as the outcome variable.
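
A minimal sketch of why the percent-change case is different, reusing the same distributions for baseline and change as in my simulation (the seed and the names `percent_change` and `fit_pct` are arbitrary choices for illustration):

```r
set.seed(2)  # arbitrary seed, only so this sketch is reproducible

N <- 200
x1     <- rnorm(N, 50, 10)    # baseline
change <- rnorm(N, 10, 5)     # change, drawn independently of baseline

# Baseline appears in the denominator of the outcome
percent_change <- 100 * change / x1

# Regress percent change on baseline
fit_pct <- lm(percent_change ~ x1)
coef(summary(fit_pct))
```

Because the outcome is divided by `x1`, larger baselines mechanically give smaller percent changes, so I would expect a negative `x1` coefficient here even though `change` itself is unrelated to baseline.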
What is the fallacy in the example above, where baseline and change are drawn independently? Why is it recommended that baseline not be kept as a predictor in regression equations where change is the outcome variable?
