How to extract just the random effects part of the formula from lme4

Question

Let us say I have fitted a model such as
mymodel <- lmer(Y~X1+(1|fac1)+(1|fac2),mydata)

How can I extract just the random effects part of the formula, i.e. (1|fac1)+(1|fac2)?
I know I can do
formula(mymodel)[-2]

but this just returns ~X1 + (1 | fac1) + (1 | fac2), which still contains the fixed effect X1.
I know I could do something with regex, but I was hoping there was an easier way.

Daniel · Answer

You can use the insight package to access various pieces of model information, such as the formula, predictors or data. insight provides type-safe "generics" that work for many different models. In this case, you could use find_formula() or find_random():

library(insight)
library(lme4)
data(sleepstudy)
sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE)
sleepstudy$mysubgrp <- NA
for (i in 1:5) {
  filter_group <- sleepstudy$mygrp == i
  sleepstudy$mysubgrp[filter_group] <-
    sample(1:30, size = sum(filter_group), replace = TRUE)
}

m <- lmer(
  Reaction ~ Days + (1 | mygrp / mysubgrp) + (1 | Subject),
  data = sleepstudy
)
#> boundary (singular) fit: see ?isSingular

find_formula(m)
#> $conditional
#> Reaction ~ Days
#> 
#> $random
#> $random[[1]]
#> ~1 | mysubgrp:mygrp
#> 
#> $random[[2]]
#> ~1 | mygrp
#> 
#> $random[[3]]
#> ~1 | Subject

find_random(m)
#> $random
#> [1] "mysubgrp:mygrp" "mygrp"          "Subject"       
#> 

find_random(m, split_nested = TRUE)
#> $random
#> [1] "mysubgrp" "mygrp"    "Subject" 

find_random(m, split_nested = TRUE, flatten = TRUE)
#> [1] "mysubgrp" "mygrp"    "Subject"

find_formula() and find_random() also work for models with a zero-inflation part that has random effects, e.g. models from the glmmTMB or brms packages. The "counterpart" for finding possible random slopes is find_random_slopes().
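
As a rough illustration of find_random_slopes(), here is a minimal sketch that assumes a sleepstudy model with a random slope for Days. The model m_slope is made up for this example and is not part of the answer above.

```r
library(insight)
library(lme4)
data("sleepstudy", package = "lme4")

# Illustrative model: random intercept and random slope for Days within Subject
m_slope <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

# Names of the grouping factors
find_random(m_slope, flatten = TRUE)

# Names of the variables that have random slopes
slopes <- find_random_slopes(m_slope)
slopes
```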

Allan Cameron · Answer

After seeing G. Grothendieck's answer I realise I'm probably reinventing the wheel, but here's a way to get the random effects parts out of a model without using regex. It uses recursion to examine each call in the formula's AST and keeps only those in parentheses, which it then rebuilds into an expression. I may be wrong, but this feels safer than switching between character strings and language objects. It could be modified to pull out only | calls.

get_random_effects <- function(mod)
{
  rip_formula <- function(form) 
  {
    
    if(rlang::is_formula(form)) form <- as.list(form)[-c(1:2)][[1]]
    if(is.call(form)) {
      call_list <- as.list(form)
      if(as.character(call_list[[1]]) == "+") 
        return(unlist(lapply(call_list[-1], rip_formula)))
      if(as.character(call_list[[1]]) == "(") 
        return(form)
     } 
  }
  
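  re_list <- rip_formula(formula(mod))
  # A right-hand side that is a single parenthesised term comes back as a bare call
  if(is.call(re_list)) re_list <- list(re_list)
  # Fold the terms left to right with `+`; handles one, two or more terms
  Reduce(function(lhs, rhs) call("+", lhs, rhs), re_list)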
}

So it's now as easy as:

get_random_effects(mymodel)
#> (1 | fac1) + (1 | fac2)
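
The modification mentioned above, pulling out only the | calls, could look roughly like the sketch below. The helper name find_bar_calls is hypothetical, and the code uses base R only (deparse1 needs R 4.0.0 or later).

```r
# Hypothetical helper: recursively collect every `|` call in a language object
find_bar_calls <- function(expr) {
  if (!is.call(expr)) return(list())            # symbols and constants hold no calls
  if (identical(expr[[1]], as.name("|"))) return(list(expr))
  # Otherwise descend into the arguments and concatenate what each one yields
  do.call(c, lapply(as.list(expr)[-1], find_bar_calls))
}

fo <- Y ~ X1 + (1 | fac1) + (1 | fac2)
bars <- find_bar_calls(fo[[3]])                 # fo[[3]] is the right-hand side
vapply(bars, deparse1, character(1))
#> [1] "1 | fac1" "1 | fac2"
```

Because it only inspects the head of each call, this version also descends through the parentheses, so the returned calls carry no surrounding ( ).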

G. Grothendieck · Answer

findbars
The lme4 package provides findbars:
library(lme4)

fo <- Y~X1+(1|fac1)+(1|fac2)

findbars(fo)
## [[1]]
## 1 | fac1
##
## [[2]]
## 1 | fac2

If character strings are needed, we can use the following. deparse1 handles certain uncommon cases where deparse fails (for example, expressions that deparse to more than one string), but deparse will mostly work as an alternative if this must run in versions of R earlier than 4.0.0.
sapply(findbars(fo), deparse1)
## [1] "1 | fac1" "1 | fac2"

If the desired result is the RHS of the formula without the fixed effects terms, we can reconstitute it from the above by adding back the parentheses and using reformulate. Omit [[2]] if a formula object is desired. The discussion above regarding deparse1 applies here too.
reformulate(sprintf("(%s)", sapply(findbars(fo), deparse1)))[[2]]
## (1 | fac1) + (1 | fac2)
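
To connect this with the fitted model in the question, findbars can be applied to formula() of the model. Below is a minimal sketch using the sleepstudy data as a stand-in (m1 is an illustrative model). The random.only argument of the merMod formula method is assumed to be available in the installed lme4 version.

```r
library(lme4)
data("sleepstudy", package = "lme4")

# Illustrative model standing in for mymodel
m1 <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Random-effects terms of the fitted model, as a list of calls
bars <- findbars(formula(m1))
bars

# Counterpart: the formula with the random-effects terms removed
nobars(formula(m1))

# Assumption: this lme4 version supports random.only in formula()
formula(m1, random.only = TRUE)
```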

terms/labels
Another way to get a character result is to use labels, which extracts the term labels from terms. Use reformulate, as above, if a formula is desired. This does not use any packages.
X <- grep("|", labels(terms(fo)), fixed = TRUE, value = TRUE)
X
## [1] "1 | fac1" "1 | fac2"

As above, the formula and right hand side of it can be generated from X like this:
reformulate(sprintf("(%s)", X))
reformulate(sprintf("(%s)", X))[[2]]

getTerms
Another approach is to use getTerms from the question "Terms of a sum in a R expression". This short function recursively walks the formula to extract the terms. It does not use any packages.
XX <- grep("|", sapply(getTerms(fo[[3]]), deparse1), fixed = TRUE, value = TRUE)
XX
## [1] "(1 | fac1)" "(1 | fac2)"

The formula and right hand side of it can be generated like this:
reformulate(XX)
reformulate(XX)[[2]]
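
getTerms is not defined in this answer. The sketch below is a hypothetical reconstruction of such a function in base R, assuming it simply splits an expression at its top-level + operators. The linked function may differ in its details.

```r
# Hypothetical reconstruction: split an expression into the terms of a sum
getTerms <- function(e) {
  if (is.call(e) && identical(e[[1]], as.name("+")) && length(e) == 3) {
    # Binary `+`: collect the terms from both operands
    c(getTerms(e[[2]]), getTerms(e[[3]]))
  } else {
    list(e)  # anything else counts as a single term
  }
}

fo <- Y ~ X1 + (1 | fac1) + (1 | fac2)
term_strings <- sapply(getTerms(fo[[3]]), deparse1)
term_strings
## [1] "X1"         "(1 | fac1)" "(1 | fac2)"
```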

NelsonGon · Answer

A simple solution that may not generalise:
# This model may not make much sense, just for reproducibility
mymodel <- lmer(Petal.Length~Sepal.Width+(1|Species) + (1|Petal.Width),iris)    
stringr::str_extract_all(formula(mymodel),"\\(.*\\)")[3]
[[1]]
[1] "(1 | Species) + (1 | Petal.Width)"

To "automatically" drop all empty elements:
purrr::compact(stringr::str_extract_all(formula(mymodel),"\\(.*\\)"))
[[1]]
[1] "(1 | Species) + (1 | Petal.Width)"

Base_R_Best_R · Answer

No regex required; however, it's still string manipulation.
# stringsplit the output of your formula()
# remove the first entry 
# remove spaces with gsub()
# paste it back together

inp <- "X1 + (1| fac1) + (1| fac2)"

paste(gsub(" ", "", unlist(strsplit(inp, "+", fixed = TRUE))[-1], fixed = TRUE), 
      collapse = " + ")

# [1] "(1|fac1) + (1|fac2)"
