Mixed Effects Model

Question

Apologies for asking when the forum already has several threads on mixed effects models, but I think my question is somewhat different.
I have to answer a question about repeated measures.
A group of people was followed during treatment for depression: 146 people (men and women), with 8 measurement times per subject. I need to determine whether the treatment works better in one gender group than in the other.
My variables of interest are ScoreHamilton (the score used to assess depression), GROUPE (gender: male or female), TEMPS (time of visit), and NUMERO (subject ID).
I know I have to use a mixed effects model, but I am not sure whether my script (below) is correct.

modMix_H0 <- lme(ScoreHamilton ~ TEMPS + GROUPE,
                 random = ~1+TEMPS|NUMERO,
                 data = Ham_norm.mix)

I fitted TEMPS (time) and GROUPE (gender) as fixed effects and NUMERO (subject) as a random effect. I am wondering whether that is right.

I am a little unsure about how I specified the random effects. I used a random intercept and a random slope, ~1+TEMPS|NUMERO, because I noticed that people usually write random effects as ~1+TIME|ID. Now I am wondering why I cannot put my variable GROUPE in the random terms, something like ~1+GROUPE|NUMERO or ~1+TEMPS+GROUPE|NUMERO.

The other part of my question is about interpreting the output.
Here is the summary of the model:

summary(modMix_H0)

Linear mixed-effects model fit by REML
 Data: Ham_norm.mix 
       AIC      BIC    logLik
  6628.782 6663.471 -3307.391

Random effects:
 Formula: ~1 + TEMPS | NUMERO
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev     Corr  
(Intercept) 4.73695760 (Intr)
TEMPS       0.08200003 -0.353
Residual    4.72718973

Fixed effects: ScoreHamilton ~ TEMPS + GROUPE 
                Value Std.Error  DF   t-value p-value
(Intercept) 22.989933 0.5959364 905  38.57783   0e+00
TEMPS       -0.352266 0.0109268 905 -32.23866   0e+00
GROUPEHomme  2.952001 0.8013428 144   3.68382   3e-04
 Correlation: 
            (Intr) TEMPS 
TEMPS       -0.359       
GROUPEHomme -0.652  0.012

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-2.66404151 -0.58774912  0.02206275  0.56281247  3.97325207

Number of Observations: 1052
Number of Groups: 146

I don't know how to interpret all of the parameters, how they affect the interpretation of my final result (that is, the impact of GROUPE on the Hamilton score), or how to judge the quality of my model.

The way I interpret this result is that the score is significantly higher in men (Homme) than in women. So the treatment improves mental state more in women (lower score), a result I was not expecting. This makes me wonder about the way I specified the model.

I have additional questions. My variable VISIT was a factor, which I converted to numeric. Does it change anything whether VISIT is a factor or numeric?
Could it change my results whether or not I use na.omit in the model, since my dataset has a lot of missing values?

Robert Long · Accepted Answer

I fitted TEMPS (time) and GROUPE (gender) as fixed effects and NUMERO (subject) as a random effect. I am wondering whether that is right.

Yes. You have repeated measures within individuals, so fitting random intercepts for NUMERO accounts for the correlation between measurements within each individual.
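As a rough illustration of that within-individual correlation, the variance components printed in the question's summary can be turned into an intraclass correlation. This is only a sketch: because the model also has a random slope for TEMPS, the implied correlation changes with time, and the figure below applies at TEMPS = 0 only. The variable names are made up for the example.

```r
# Standard deviations copied from the "Random effects" block of the summary.
sd_intercept <- 4.73695760   # between-subject SD of the intercept
sd_residual  <- 4.72718973   # within-subject residual SD

# Share of the total variance at TEMPS = 0 that is due to differences
# between subjects (the random slope contributes nothing at TEMPS = 0).
icc_at_baseline <- sd_intercept^2 / (sd_intercept^2 + sd_residual^2)
print(round(icc_at_baseline, 2))
```

A value of about 0.5 would mean that roughly half of the baseline variability in ScoreHamilton is between subjects rather than within them, which is the dependence the random intercept is there to absorb.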

Now I am wondering why I cannot put my variable GROUPE in the random terms, something like ~1+GROUPE|NUMERO or ~1+TEMPS+GROUPE|NUMERO.

You can. The terms to the left of the | symbol specify the random effects (the 1 is the random intercept, any other variable is a random slope), and the variable to the right is the grouping factor. This means that you are allowing the effect of whatever variable appears on the left side to vary across the levels of whatever is on the right side. So in your case ~1+GROUPE|NUMERO means that the effect of gender can be different for each individual. You should ask yourself whether this makes sense in your field; since gender usually does not change within individuals, I expect that it does not. ~1+TEMPS+GROUPE|NUMERO additionally allows the effect of time to be different for each individual.
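One quick way to see why a random slope for gender is questionable is to count how many distinct values each variable takes within a subject. The sketch below uses a small made-up data frame (`toy`, not the asker's data) whose column names mirror the question.

```r
# Hypothetical toy data: 6 subjects with 4 visits each.
toy <- data.frame(
  NUMERO = factor(rep(1:6, each = 4)),
  TEMPS  = rep(c(0, 7, 14, 21), times = 6),
  GROUPE = factor(rep(c("Femme", "Homme"), each = 12))
)

# Number of distinct GROUPE values observed for each subject.
groupe_values_per_subject <- tapply(toy$GROUPE, toy$NUMERO,
                                    function(g) length(unique(g)))

# Number of distinct TEMPS values observed for each subject.
temps_values_per_subject <- tapply(toy$TEMPS, toy$NUMERO,
                                   function(tt) length(unique(tt)))

print(groupe_values_per_subject)
print(temps_values_per_subject)
```

In this toy data GROUPE takes a single value per subject while TEMPS takes several, so only TEMPS has any within-subject variation from which a subject-specific slope could be estimated.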

The way I interpret this result is that the score is significantly higher in men (Homme) than in women. So the treatment improves mental state more in women (lower score), a result I was not expecting. This makes me wonder about the way I specified the model.

Your reading of the coefficient is correct. The estimate for GROUPEHomme can be interpreted as the difference in ScoreHamilton between GROUPE=Homme and the reference level of GROUPE, with the other fixed effects held constant and conditional on the estimated random effects. Note that, because the model contains no TEMPS:GROUPE interaction, this difference is assumed to be the same at every time point, so on its own it does not tell you whether the score falls faster in one gender than in the other.
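Whether the score declines at a different rate in each gender is not something the main effect of GROUPE can show by itself; a common way to examine it is to add a TEMPS:GROUPE interaction. The following is a minimal sketch on simulated data, not the asker's: the data frame `sim`, the visit days, and the effect sizes are all assumptions chosen for illustration, and only a random intercept is used to keep the toy fit stable (on real data the ~1+TEMPS|NUMERO structure from the question could be kept).

```r
library(nlme)

# Simulated data: 40 subjects, 8 visits, with a steeper decline built in
# for women so that the interaction has something to detect.
set.seed(42)
n_subj <- 40
visits <- seq(0, 49, by = 7)   # 8 hypothetical visit days
sim <- expand.grid(TEMPS = visits, NUMERO = factor(seq_len(n_subj)))
subj <- as.integer(sim$NUMERO)
sim$GROUPE <- factor(ifelse(subj <= n_subj / 2, "Femme", "Homme"))

u0    <- rnorm(n_subj, sd = 4)[subj]                    # subject-level shift
slope <- ifelse(sim$GROUPE == "Femme", -0.40, -0.25)    # assumed true slopes
sim$ScoreHamilton <- 23 + u0 + slope * sim$TEMPS + rnorm(nrow(sim), sd = 3)

# TEMPS * GROUPE expands to TEMPS + GROUPE + TEMPS:GROUPE.
mod_int <- lme(ScoreHamilton ~ TEMPS * GROUPE,
               random = ~ 1 | NUMERO,
               data = sim)

# The TEMPS:GROUPEHomme row estimates how much the slope for men differs
# from the slope for women (the reference level).
coef_table <- summary(mod_int)$tTable
print(coef_table)
```

In this parameterisation the TEMPS row is the slope for the reference gender and the interaction row is the difference in slopes, so it is the interaction term, rather than the GROUPEHomme main effect, that speaks to "works better in one group".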

I have additional questions. My variable VISIT was a factor, which I converted to numeric. Does it change anything whether VISIT is a factor or numeric?

Yes, it could. I don't see VISIT in your model. From your question it appears that TEMPS is the variable that was created from VISIT. I am assuming that the original factor variable had levels such as "10 days", "20 days", "25 days", ... and that you converted these to a numeric variable 10, 20, 25, ... By doing this, your model estimates a linear effect for TEMPS. By keeping the variable as a factor you allow for non-linearity. If there are few levels / values then this does not matter too much, but if you have many levels then retaining the variable as a factor will lead to a model with a separate estimate for each level, which becomes difficult to interpret. If you want to allow for non-linearity, one way is to use the numeric variable and specify quadratic (and possibly higher-order) terms for it, or to use splines. Since you have 8 measurement occasions, I would be inclined to use the numeric variable with splines.
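The three ways of entering time (linear numeric, factor, spline) can be compared side by side. The sketch below uses simulated data with a deliberately curved trajectory; the data frame `sim`, the visit days, and the choice of df = 3 for the natural spline are assumptions for illustration, not recommendations for the asker's data.

```r
library(nlme)
library(splines)

# Simulated data: 30 subjects, 8 visits, fast early improvement that levels off.
set.seed(7)
n_subj <- 30
visits <- seq(0, 49, by = 7)   # 8 hypothetical visit days
sim <- expand.grid(TEMPS = visits, NUMERO = factor(seq_len(n_subj)))
u0 <- rnorm(n_subj, sd = 4)[as.integer(sim$NUMERO)]
sim$ScoreHamilton <- 10 + 13 * exp(-sim$TEMPS / 12) + u0 +
  rnorm(nrow(sim), sd = 2)

# Fit by ML rather than REML so that models with different fixed effects
# can be compared on likelihood-based criteria.
fit_linear <- lme(ScoreHamilton ~ TEMPS,
                  random = ~ 1 | NUMERO, data = sim, method = "ML")
fit_factor <- lme(ScoreHamilton ~ factor(TEMPS),
                  random = ~ 1 | NUMERO, data = sim, method = "ML")
fit_spline <- lme(ScoreHamilton ~ ns(TEMPS, df = 3),
                  random = ~ 1 | NUMERO, data = sim, method = "ML")

# Number of fixed-effect coefficients each specification needs.
n_coef <- c(linear = length(fixef(fit_linear)),
            spline = length(fixef(fit_spline)),
            factor = length(fixef(fit_factor)))
print(n_coef)

# Lower AIC indicates a better trade-off between fit and complexity.
print(AIC(fit_linear, fit_spline, fit_factor))
```

The factor version spends one coefficient per visit, the linear version forces a straight line, and the spline sits in between, which is the trade-off described above.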

Could it change my results whether or not I use na.omit in the model, since my dataset has a lot of missing values?

In lme the default is na.action = na.fail, so missing data causes an error. Setting na.action = na.omit (or dropping the incomplete rows yourself beforehand) makes it run. This removes every row that has a missing value in any variable used in the model, and this can lead to substantial bias. Depending on the reasons for and the extent of the missingness, you would be well advised to consider multiple imputation to address this problem.
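The behaviour with missing values can be checked on a small simulated example. The data frame `sim` and the number of deleted scores below are made up for illustration; the point is only to show the `na.action` argument of `lme` and how many rows end up in the fit.

```r
library(nlme)

# Simulated data: 20 subjects, 8 visits, then 25 scores set to missing.
set.seed(3)
n_subj <- 20
sim <- expand.grid(TEMPS = seq(0, 49, by = 7),
                   NUMERO = factor(seq_len(n_subj)))
u0 <- rnorm(n_subj, sd = 4)[as.integer(sim$NUMERO)]
sim$ScoreHamilton <- 23 + u0 - 0.35 * sim$TEMPS + rnorm(nrow(sim), sd = 3)
sim$ScoreHamilton[sample(nrow(sim), 25)] <- NA

# With the default na.action (na.fail) the call stops with an error,
# which try() captures here so the script can continue.
fit_default <- try(lme(ScoreHamilton ~ TEMPS, random = ~ 1 | NUMERO,
                       data = sim),
                   silent = TRUE)
failed_by_default <- inherits(fit_default, "try-error")

# With na.omit the incomplete rows are dropped before fitting.
fit_omit <- lme(ScoreHamilton ~ TEMPS, random = ~ 1 | NUMERO,
                data = sim, na.action = na.omit)
rows_used <- fit_omit$dims$N

print(failed_by_default)
print(c(rows_total = nrow(sim), rows_used = rows_used))
```

Comparing the number of rows used with the number of rows in the data is a quick check on how much information listwise deletion is discarding before deciding whether imputation is worth the effort.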

Final note: nlme is an old package. lme4 was developed later, with some of the same authors, and is a better choice in most situations.
