Cross Validated Asked by NickB on November 2, 2021
I’ve become quite familiar with linear mixed effects models, but I’m not very certain about their generalized counterparts.
I have a data set which looks like the following:
   PP   Age Gender Education       Student    Familiarity ID         Correct
5  pp1  22  Male   PostGr          Student    Yes         widgetTime 1
6  pp1  22  Male   PostGr          Student    Yes         racePlace  1
7  pp1  22  Male   PostGr          Student    Yes         emilyName  1
8  pp10 24  Male   BachelorsDegree NotStudent Yes         emilyName  1
9  pp10 24  Male   BachelorsDegree NotStudent No          farmSheep  1
10 pp10 24  Male   BachelorsDegree NotStudent Yes         dirtHole   1
Participants (PP) were asked to answer seven questions (ID), and each answer was Correct (1) or incorrect (0). For every question, they were also asked whether they had ever seen that question before (Familiarity = Yes or No). The data set also includes demographics: Age, Gender (2 levels), Education (3 levels), and Student (2 levels).
In particular, I want to know whether being familiar (Familiarity) with a question (ID) leads to more Correct answers. My first thought was a chi-squared test (Familiarity x Correct), but I was told it was not the best choice because, if I recall correctly, it assumes independent observations. I was advised that a generalized linear mixed effects model fit with R's glmer function would help, since I can also include random effects (PP and ID). I know that a GLMM is better suited to these data because the DV is binomial (0, 1).
My question is: can I use all the predictors (contrast coded as -0.5, 0.5, except for Education, which is coded 1, 1, -2 and 1, -1, 0) in my glmer formula, even though Age is numeric? I'm used to using only categorical variables in my lmer models. I assume my formula would be as follows:
glmer.model <- glmer(Correct ~ Familiarity + Age + Gender + Education + Student + (1|PP) + (1|ID), family = binomial, data = DemoScore)
Should I also convert the Familiarity variable to a factor (and therefore contrast code it), or recode it as 0 and 1?
I initially thought that summing the Correct scores for each PP would make more sense, but it does not, since Familiarity varies question by question.
Any clarification would be highly appreciated.
> can I use all the predictors (contrast coded as -0.5, 0.5 (except for Education, which is 1, 1, -2 & 1, -1, 0)) in my glmer formula, even though Age is numeric?
Yes. There are no particular concerns about including different types of predictors (numeric and categorical) in a GLMM; in this respect it is no different from other types of regression model.
> I assume my formula would be as follows:
> glmer.model <- glmer(Correct ~ Familiarity + Age + Gender + Education + Student + (1|PP) + (1|ID), family = binomial, data = DemoScore)
This model makes sense. If all participants saw all questions, then you have a crossed design, and (1|PP) + (1|ID) is the right way to specify the random-effects structure.
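As a minimal sketch of fitting such a crossed model (using simulated data standing in for the question's DemoScore, and assuming the lme4 package is installed; only two of the fixed effects are included for brevity):

```r
library(lme4)

# Simulated data mimicking the question's structure:
# 30 participants crossed with 7 questions
set.seed(42)
dat <- expand.grid(PP = paste0("pp", 1:30), ID = paste0("q", 1:7))
dat$Familiarity <- factor(sample(c("No", "Yes"), nrow(dat), replace = TRUE))
dat$Age         <- rep(sample(20:35, 30, replace = TRUE), times = 7)
dat$Correct     <- rbinom(nrow(dat), 1, 0.7)

# Crossed random intercepts for participants (PP) and questions (ID)
fit <- glmer(Correct ~ Familiarity + Age + (1 | PP) + (1 | ID),
             family = binomial, data = dat)
summary(fit)  # fixed-effect estimates are on the log-odds scale
```

With simulated noise like this the random-effect variances may be estimated near zero (a "singular fit" message), which is expected; with real data carrying genuine participant and question effects the model behaves as intended.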
> Should I also be factoring the Familiarity variable (and therefore contrast code) or convert it into 0 and 1?
It's not completely clear what you mean. From the data listed in the question, Familiarity appears to be a factor with levels Yes and No. With R's default treatment contrasts, this will be treated exactly the same as if you recoded it to 0/1, with No (the first level alphabetically) as the reference level.
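If it helps, here is a base-R sketch (with simulated data, not the question's DemoScore) showing that a Yes/No factor under default treatment contrasts and a 0/1 numeric recoding produce identical logistic-regression fits:

```r
# With default treatment contrasts, a Yes/No factor and a 0/1 numeric
# predictor yield the same design matrix, hence identical coefficients
set.seed(1)
fam_factor  <- factor(sample(c("No", "Yes"), 200, replace = TRUE))
fam_numeric <- as.integer(fam_factor == "Yes")   # 0/1 recoding
correct     <- rbinom(200, 1, plogis(-0.5 + fam_numeric))

fit_factor  <- glm(correct ~ fam_factor,  family = binomial)
fit_numeric <- glm(correct ~ fam_numeric, family = binomial)

# Same intercept and slope, only the coefficient names differ
all.equal(unname(coef(fit_factor)), unname(coef(fit_numeric)))
```

So the choice between the two codings is a matter of convenience and interpretation, not of model fit; contrast coding (e.g. -0.5/0.5) only changes what the intercept and coefficient mean.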
> I initially thought summing the Correct scores for each PP would make more sense, but this does not as Familiarity is for question-by-question.
That's correct. Summing or averaging the scores would lose important information, in particular the question-level Familiarity predictor.
Answered by Robert Long on November 2, 2021