Cross Validated Asked by rnso on August 13, 2020
I am using commonly available iris dataset and trying to do following regression:
PW ~ PL + SL + SW
Since samples are taken from 3 "Species", this is kept as random or group variable.
The results of Linear Mixed Regression are:
Mixed Linear Model Regression Results
=====================================================
Model: MixedLM Dependent Variable: PW
No. Observations: 150 Method: REML
No. Groups: 3 Scale: 0.0278
Min. group size: 50 Log-Likelihood: 41.4680
Max. group size: 50 Converged: Yes
Mean group size: 50.0
-----------------------------------------------------
Coef. Std.Err. z P>|z| [0.025 0.975]
-----------------------------------------------------
Intercept 0.082 0.335 0.245 0.807 -0.575 0.740
SL -0.098 0.045 -2.199 0.028 -0.186 -0.011
SW 0.238 0.048 4.975 0.000 0.144 0.332
PL 0.257 0.050 5.139 0.000 0.159 0.355
Group Var 0.257 1.636
=====================================================
While the results of GEE regression are:
GEE Regression Results
===================================================================================
Dep. Variable: PW No. Observations: 150
Model: GEE No. clusters: 3
Method: Generalized Min. cluster size: 50
Estimating Equations Max. cluster size: 50
Family: Gaussian Mean cluster size: 50.0
Dependence structure: Independence Num. iterations: 2
Date: Thu, 16 Jul 2020 Scale: 0.037
Covariance type: robust Time: 02:42:49
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
Intercept -0.2403 0.151 -1.595 0.111 -0.536 0.055
SL -0.2073 0.088 -2.349 0.019 -0.380 -0.034
SW 0.2228 0.073 3.036 0.002 0.079 0.367
PL 0.5241 0.049 10.711 0.000 0.428 0.620
==============================================================================
Skew: 0.2232 Kurtosis: 0.9437
Centered skew: -0.2824 Centered kurtosis: 1.2493
==============================================================================
=============== cov_struct.summary() ===============
Observations within a cluster are modeled as being independent.
Although P-values for all 3 predictor variables are significant in both, they are different in 2 analyses.
Moreover, the coefficients are quite different:
Which of these analyses is more appropriate and acceptable? Thanks for your insight.
When I fit these models in R I get very similar estimates to those that you obtained:
> data("iris")
> # lmm
> m.lmm <- lmer(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length + (1|Species), data = iris)
> m.gee <- geeglm(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, id = Species, data = iris, corstr = "independence")
> summary(m.lmm)
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.0821 0.3356 0.24
Sepal.Length -0.0984 0.0444 -2.22
Sepal.Width 0.2380 0.0477 4.99
Petal.Length 0.2567 0.0478 5.37
> summary(m.gee)
Coefficients:
Estimate Std.err Wald Pr(>|W|)
(Intercept) -0.2403 0.1506 2.55 0.1106
Sepal.Length -0.2073 0.0882 5.52 0.0188 *
Sepal.Width 0.2228 0.0734 9.22 0.0024 **
Petal.Length 0.5241 0.0489 114.72 <2e-16 ***
The diffeence is mostle due to using independence
as the correlation structure. To be equivalent to the mixed model you should use exchangable
:
> m.gee1 <- geeglm(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, id = Species, data = iris, corstr="exchangeable")
> summary(m.gee1)
Coefficients:
Estimate Std.err Wald Pr(>|W|)
(Intercept) 0.0767 0.1960 0.15 0.695
Sepal.Length -0.1015 0.0254 16.02 6.3e-05 ***
Sepal.Width 0.2357 0.0958 6.06 0.014 *
Petal.Length 0.2647 0.0332 63.45 1.7e-15 ***
Exchangeable correlation structure means that the residual covariance between all species is the same, which is the same assumption as in mixed effects models.
Correct answer by Robert Long on August 13, 2020
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP