Cross Validated Asked on December 6, 2021
It is generally recommended that baseline should not be kept as a predictor when change is the outcome variable. Explanations for this have treated both the baseline and final values as random (e.g. see here). However, the final value is likely to be close to, and related to, the initial value. It is the change (the effect of the drug or intervention) that can be taken as random. Then change is not "mathematically coupled" with baseline.
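The two setups can be compared with a short covariance calculation. This is only a sketch, and the notation is an assumption that does not appear in the original post: B is the baseline, F the final value, C = F - B the change, and \sigma_B^2 and \sigma_F^2 are the variances of B and F.

```latex
\begin{align*}
\text{(a) } B, F \text{ independent:}\quad
  & \operatorname{Cov}(C, B) = \operatorname{Cov}(F, B) - \operatorname{Var}(B) = -\sigma_B^2, \\
  & \operatorname{Corr}(C, B) = \frac{-\sigma_B}{\sqrt{\sigma_B^2 + \sigma_F^2}}
    \quad \left(= -\tfrac{1}{\sqrt{2}} \text{ when } \sigma_B = \sigma_F\right). \\[4pt]
\text{(b) } B, C \text{ independent:}\quad
  & \operatorname{Cov}(C, B) = 0, \\
  & \operatorname{Cov}(F, B) = \operatorname{Cov}(B + C, B) = \sigma_B^2 > 0.
\end{align*}
```

Under (a) the negative correlation between change and baseline arises by construction. Under (b), which is the setup proposed here, change is uncorrelated with baseline while the final value stays positively related to it.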
If we take baseline and change as random values (rather than baseline and final values), keeping baseline as a predictor of change does not seem to give spuriously positive results:
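N <- 200
x1 <- rnorm(N, 50, 10)             # baseline values
trt <- rep(c(0, 1), each = N/2)    # allocate to 2 groups
change <- rnorm(N, 10, 5)          # change, drawn independently of baseline and trt
summary(lm(change ~ x1 * trt))     # baseline, treatment and their interaction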
The output of the above is:
Call:
lm(formula = change ~ x1 * trt)

Residuals:
     Min       1Q   Median       3Q      Max
-13.5833  -3.3792  -0.0617   3.5979  16.7672

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  6.32244    3.07271   2.058    0.041 *
x1           0.08295    0.06260   1.325    0.187    << NOTE: NOTHING IS SIGNIFICANT
trt          3.86000    4.14419   0.931    0.353
x1:trt      -0.08836    0.08204  -1.077    0.283
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.256 on 196 degrees of freedom
Multiple R-squared:  0.01061,   Adjusted R-squared:  -0.004531
F-statistic: 0.7008 on 3 and 196 DF,  p-value: 0.5526
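For contrast, here is a minimal sketch of the other setup, in which baseline and final values are drawn independently and change is computed as their difference. The names `baseline`, `final`, `change2` and `fit_coupled`, the seed, and the parameters used for `final` are illustrative assumptions and are not taken from the original post.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
N <- 200
baseline <- rnorm(N, 50, 10)       # baseline values
final <- rnorm(N, 60, 10)          # final values, independent of baseline (assumption)
change2 <- final - baseline        # change is now derived from baseline

fit_coupled <- lm(change2 ~ baseline)
coef(fit_coupled)["baseline"]      # slope is expected to be near -1
cor(change2, baseline)             # expected to be near -1/sqrt(2)
```

In this setup baseline appears as a strong negative predictor of change even though no real relationship was simulated, which is the "mathematical coupling" the usual explanations describe.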
Of course, we should not keep baseline as a predictor if we take "percent_change" (100*change/baseline) as the outcome variable.
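The percent-change case can be checked with a similar simulation. This is a sketch under assumptions: the seed, the larger sample size and the name `fit_pct` are illustrative choices. Because baseline sits in the denominator, the expected percent change given baseline is 100 * mean(change) / baseline, so a negative slope is expected even though change itself is drawn independently of baseline.

```r
set.seed(1)                           # arbitrary seed, for reproducibility only
N <- 2000                             # larger than in the question (assumption)
x1 <- rnorm(N, 50, 10)                # baseline values
change <- rnorm(N, 10, 5)             # change, independent of baseline
percent_change <- 100 * change / x1   # outcome now contains baseline

fit_pct <- lm(percent_change ~ x1)
summary(fit_pct)$coefficients         # inspect the slope on x1 and its p-value
```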
What is the fallacy in the above example? Why is it recommended that baseline not be kept as a predictor in regression equations where change is the outcome variable?