Groups, levels and denominator degrees of freedom in mixed effects models

Question

I am trying very hard (I am not a statistician) to understand the concepts of "groups" and "levels" in mixed effects models. In particular, I am trying to understand them in the context of the denominator degrees of freedom (denDF) of the fixed effect terms of a model. I know that the choice of denDF is an open question in mixed effects models, so for the sake of the current discussion I will refer to the denDFs estimated with the inner-outer rule proposed by Pinheiro and Bates (p. 91) and summarized here.

First off, can anyone define what "levels" and "groups" are in a mixed effects model? My understanding is that groups are defined by the structure of the random effects (and therefore by the model). However, I am confused by the definition of levels. Also, is there any relationship between levels and groups? In other words, can I modify the number of groups (by changing the structure of the random effects) independently of the levels? Are levels inherently defined by the structure of the data, or do they emerge from the definition of the model?

Regarding the definition of denDF. Pinheiro and Bates define:

$denDF_i = m_i -(m_{i-1}+p_i), i=1,...,Q+1$

where $m_i$ denotes the total number of groups in level $i$, $p_i$ denotes the number of degrees of freedom corresponding to the terms estimated at level $i$, and $Q$ is the number of grouping factors. My questions are:

This definition is highly dependent on the definition of levels. How do I know which groups belong to which level?
According to this definition, there seem to be $Q+1$ denDFs. But I can change the number of grouping factors by changing the structure of the random effects, while I always need as many denDFs as the number of fixed effect terms to be estimated plus 1 (the intercept). For example, in the Machines example from the book, I could define random=~1|Worker (Q=1, hence 2 denDFs) or random=~1|Worker/Machine (Q=2, hence 3 denDFs). Wouldn't I need only two denDFs in both models, one for the Intercept and one for Machine? Why does this depend on the definition of the grouping?
I cannot understand the logic of this formula. In particular, why do I need to subtract $p_i$?

Finally, is there a rule of thumb for deciding how to define groups (and therefore the random effect structure) in a model? This seems to be a very important design decision: the definition of the groups influences the denDF, and therefore the results of the hypothesis tests on the fixed effect terms.

I do apologize for my naive questions, but I am really trying to understand this. If this is too basic for this forum, can you give me specific references where I can read about these issues? Thanks a lot for your help.

Robert Long · Answer

This is a great question and I am not surprised about the confusion.

I am writing this as an answer because it is too long for a comment. I am only going to address the first question here. If it makes sense, perhaps the remaining questions will become redundant.

First off, can anyone define what "levels" and "groups" are in a mixed effects model?

In short, there are "levels" of the grouping factor (the "groups") and there are "levels" of nesting (0, 1, 2, etc.), and the two are completely different things.

In a bit more detail, "groups" are the unique items within a "grouping factor", which is generally a categorical variable. For example, if we had observations on patients within the wards of a hospital, the "grouping factor" (just a variable name in the dataset) would be something like wardID. The "groups" are the individual wards, so if there were 20 wards in the sample, there would be 20 groups.
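As a small illustration of that definition, here is a minimal Python sketch. The patient and ward labels are made up for the example, and the variable names are hypothetical. It shows only that the groups are the distinct values of the grouping factor, however many observations fall into each one.

```python
from collections import Counter

# Hypothetical records: one (patient_id, ward_id) pair per observed patient.
records = [
    ("p1", "ward_A"), ("p2", "ward_A"), ("p3", "ward_A"),
    ("p4", "ward_B"), ("p5", "ward_B"),
    ("p6", "ward_C"),
]

# The grouping factor is the ward_id column; its distinct values are the groups.
group_sizes = Counter(ward for _, ward in records)
n_groups = len(group_sizes)

print(n_groups)           # number of groups (distinct wards): 3
print(dict(group_sizes))  # number of observations inside each group
```

Six observations, but only three groups: the number of groups is a property of the grouping factor, not of the sample size.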

Things start to get confusing when talking about "levels". In Pinheiro and Bates they are discussing models with nested grouping factors, and when they talk about levels they mean the relative level of nesting. Sticking with the patients-in-wards scenario, if we now have multiple hospitals, then wards are nested within hospitals, so any individual ward "belongs" to one and only one hospital. By convention, the lower level (wards in this case) is level 1, so the hospital level is level 2. If there were a further level of nesting, say City, that would be level 3. Hopefully that makes sense.

The source of confusion is that within any particular grouping factor, the individual entities (e.g. individual hospitals) are also often referred to as "levels". So in some contexts, "groups" and "levels" can be used interchangeably. In R, if we refer back to the Machines example in nlme and look at the structure of the Machine grouping variable, we obtain:

> str(Machines$Machine)
 Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...

So it is necessary to understand the context of what they are talking about in the book. For instance when they say:

grouping level at which the term is estimated

they are talking about level 1 or level 2 of the nesting structure (wards or hospitals in my example, or worker and machine in theirs). On the other hand, when they say:

A term is outer to a grouping factor if its value does not change within levels of the grouping factor

they are talking about the individual levels (individual wards or hospitals) within a grouping factor.
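To show how the two meanings feed into the formula quoted in the question, here is a rough Python sketch. It rests on several assumptions:

- The data are synthetic, laid out as 6 workers by 3 machines by 3 replicates to mimic the Machines example.
- Every helper name is hypothetical.
- Levels are numbered from the outermost grouping factor inwards, with the observations as level $Q+1$, because the formula subtracts $m_{i-1}$ from $m_i$. This runs in the opposite direction to the bottom-up numbering described above.
- The conventions $m_0 = 1$ when the model has an intercept and $m_{Q+1} = N$ are my reading of the book. They are not part of the quoted text.

```python
from itertools import product

# Synthetic balanced layout (assumed): 6 workers x 3 machines x 3 replicates.
workers = [f"W{i}" for i in range(1, 7)]
machines = ["A", "B", "C"]
rows = [{"Worker": w, "Machine": m, "rep": r}
        for w, m, r in product(workers, machines, range(3))]


def group_key(row, factors):
    """Identify the group a row belongs to for a list of nested factors."""
    return tuple(row[f] for f in factors)


def count_groups(rows, factors):
    """Number of distinct groups (m_i) at one level of nesting."""
    return len({group_key(r, factors) for r in rows})


def is_outer(rows, term, factors):
    """True if `term` never changes within any group defined by `factors`."""
    seen = {}
    for r in rows:
        if seen.setdefault(group_key(r, factors), r[term]) != r[term]:
            return False
    return True


def den_df(m, p, intercept=True):
    """denDF_i = m_i - (m_{i-1} + p_i) for i = 1..Q+1.

    m: group counts per level, outermost first, ending with N (assumed).
    p: fixed-effect degrees of freedom estimated at each level.
    Assumes m_0 = 1 when the model has an intercept, else 0.
    """
    m_prev = [1 if intercept else 0] + m[:-1]
    return [mi - (mp + pi) for mi, mp, pi in zip(m, m_prev, p)]


# random = ~1 | Worker: one grouping factor (Q = 1).
m_one = [count_groups(rows, ["Worker"]), len(rows)]  # [6, 54]
# Machine changes within a worker (inner), so its 2 df sit at level 2.
assert not is_outer(rows, "Machine", ["Worker"])
df_one = den_df(m_one, p=[0, 2])
print(df_one)  # [5, 46]

# random = ~1 | Worker/Machine: two nested grouping factors (Q = 2).
m_two = [count_groups(rows, ["Worker"]),
         count_groups(rows, ["Worker", "Machine"]),
         len(rows)]  # [6, 18, 54]
# Machine is constant within each Worker/Machine group (outer to it),
# but still inner to Worker, so its 2 df sit at level 2.
assert is_outer(rows, "Machine", ["Worker", "Machine"])
df_two = den_df(m_two, p=[0, 2, 0])
print(df_two)  # [5, 10, 36]
```

The number of groups at each nesting level supplies the $m_i$. The outer/inner check, which looks at the individual levels of a grouping factor, decides at which nesting level a term's degrees of freedom are counted in $p_i$. Under these assumptions, the Machine term gets 46 denominator degrees of freedom with the single grouping factor and 10 with the nested one.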

I hope this helps!
