How to avoid repeating code in dplyr::mutate() call with multiple arguments?

Stack Overflow Asked by RemPsyc on November 17, 2021

Problem

I am transitioning to dplyr from base R.

I would like to shorten the following code to respect the DRY (Don’t Repeat Yourself) principle:

library(dplyr)

mtcars %>% mutate(w = rowMeans(select(., mpg:disp), na.rm = TRUE),
                  x = rowMeans(select(., hp:wt), na.rm = TRUE),
                  y = rowMeans(select(., qsec:am), na.rm = TRUE),
                  z = rowMeans(select(., gear:carb), na.rm = TRUE))

or

mtcars %>% rowwise() %>% mutate(w = mean(mpg:disp, na.rm = TRUE),
                                x = mean(hp:wt, na.rm = TRUE),
                                y = mean(qsec:am, na.rm = TRUE),
                                z = mean(gear:carb, na.rm = TRUE))
# Note: this one produced an error with my own data
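
A likely reason the second version misbehaves, sketched in base R: inside mutate(), mpg:disp is not a column selection. It is the ordinary colon operator applied to the values of mpg and disp, so for the first row it builds the sequence 21:160 and averages that. A missing value at either end makes the colon operator fail outright, which may be the kind of error the note above refers to (an assumption, since the original data is not shown). The names row1, seq_mean, col_mean and na_result are only for illustration.

```r
# First row of mtcars: mpg = 21, cyl = 6, disp = 160
row1 <- mtcars[1, ]

# What mean(mpg:disp) evaluates to for that row: the mean of the sequence 21:160
seq_mean <- with(row1, mean(mpg:disp))

# What was intended: the mean of the columns mpg, cyl and disp
col_mean <- unname(rowMeans(row1[c("mpg", "cyl", "disp")]))

# With a missing endpoint, the colon operator raises an error
na_result <- tryCatch(NA_real_:3, error = function(e) "error")

print(c(seq_mean = seq_mean, col_mean = col_mean))
print(na_result)
```

The two means differ, so this version can run without complaint on clean data and still return the wrong numbers.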

Goal

The goal is to compute the means of different scales in a data frame from a single call. As you can see, the rowMeans() and select() calls and the na.rm argument are repeated for every scale (imagine I have many more variables than in this example).

Attempts

I was trying to come up with an across() solution,

mtcars %>% mutate(across(mpg:carb, mean, .names = "mean_{col}"))

But it doesn’t produce the correct outcome, because I don’t see how to specify different column ranges for w through z. Using c_across() from the documentation example, we are back to repeating code:

mtcars %>% rowwise() %>% mutate(w = mean(c_across(mpg:disp), na.rm = TRUE),
                                x = mean(c_across(hp:wt), na.rm = TRUE),
                                y = mean(c_across(qsec:am), na.rm = TRUE),
                                z = mean(c_across(gear:carb), na.rm = TRUE))

I am tempted to resort to lapply or a custom function, but I feel like that would defeat the purpose of adapting to dplyr and the new across() function.

Edit: To clarify, I want to avoid writing rowMeans, select, and na.rm more than once.

2 Answers

Use a custom function, but organize it a bit differently to reduce the repeated code:

library(dplyr)

# Adds one row-wise mean column; expects a rowwise data frame
mm <- function(data, new_col, cols_to_mut) {
    data %>%
        mutate(
            {{ new_col }} := mean(c_across({{ cols_to_mut }}), na.rm = TRUE)
        )
}

mtcars %>%
    rowwise() %>%
    mm(w, mpg:disp) %>%
    mm(x, hp:wt) %>%
    mm(y, qsec:am) %>%
    mm(z, gear:carb) %>%
    ungroup()

Answered by CPak on November 17, 2021

We don't need rowwise here. Instead, use select with rowMeans, which is vectorized. To make this easier, we can wrap the two calls in a function:

library(dplyr)

# Row means over the selected columns, computed in one vectorized call
f1 <- function(dat, nm1) {
    dat %>%
        select({{ nm1 }}) %>%
        rowMeans(na.rm = TRUE)
}

mtcars %>% mutate(w = f1(dat = ., nm1 = mpg:disp),
                  x = f1(dat = ., nm1 = hp:wt),
                  y = f1(dat = ., nm1 = qsec:am),
                  z = f1(dat = ., nm1 = gear:carb))

Answered by akrun on November 17, 2021
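
Both answers still name the helper once per scale. One possible way to get down to a single call, offered as a sketch and not as part of either answer, is to describe the scales as a named list and loop over it. The list scales and the helper add_row_means are hypothetical names, and the column ranges are spelled out as character vectors on the assumption that mtcars keeps its usual column order (so mpg:disp is mpg, cyl, disp, and so on).

```r
library(dplyr)

# Hypothetical scale definitions: new column name -> source columns
scales <- list(
  w = c("mpg", "cyl", "disp"),  # same columns as mpg:disp
  x = c("hp", "drat", "wt"),    # same columns as hp:wt
  y = c("qsec", "vs", "am"),    # same columns as qsec:am
  z = c("gear", "carb")         # same columns as gear:carb
)

# Hypothetical helper: rowMeans, select and na.rm each appear exactly once
add_row_means <- function(data, scales) {
  means <- lapply(scales, function(cols) {
    unname(rowMeans(select(data, all_of(cols)), na.rm = TRUE))
  })
  # Splice the named list into mutate() so each element becomes a column
  mutate(data, !!!means)
}

result <- mtcars %>% add_row_means(scales)
head(result[, c("w", "x", "y", "z")], 3)
```

Adding another scale then only means adding one entry to the list. The trade-off is that the column ranges are written as strings instead of the bare mpg:disp syntax used in the question.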
