neversaint

Reputation: 64014

How to use dplyr::mutate_all for rounding selected columns

I'm using the following package version:

# devtools::install_github("hadley/dplyr")
> packageVersion("dplyr")
[1] ‘0.5.0.9001’

With the following tibble:

library(dplyr)
df  <- structure(list(gene_symbol = structure(1:6, .Label = c("0610005C13Rik", 
"0610007P14Rik", "0610009B22Rik", "0610009L18Rik", "0610009O20Rik", 
"0610010B08Rik"), class = "factor"), fold_change = c(1.54037, 
1.10976, 0.785, 0.79852, 0.91615, 0.87931), pvalue = c(0.5312, 
0.00033, 0, 0.00011, 0.00387, 0.01455), ctr.mean_exp = c(0.00583, 
59.67286, 83.2847, 6.88321, 14.67696, 1.10363), tre.mean_exp = c(0.00899, 
66.22232, 65.37819, 5.49638, 13.4463, 0.97043), ctr.cv = c(5.49291, 
0.20263, 0.17445, 0.46288, 0.2543, 0.39564), tre.cv = c(6.06505, 
0.28827, 0.33958, 0.53295, 0.26679, 0.52364)), .Names = c("gene_symbol", 
"fold_change", "pvalue", "ctr.mean_exp", "tre.mean_exp", "ctr.cv", 
"tre.cv"), row.names = c(NA, -6L), class = c("tbl_df", "tbl", 
"data.frame"))

That looks like this:

> df
# A tibble: 6 × 7
    gene_symbol fold_change  pvalue ctr.mean_exp tre.mean_exp  ctr.cv  tre.cv
         <fctr>       <dbl>   <dbl>        <dbl>        <dbl>   <dbl>   <dbl>
1 0610005C13Rik     1.54037 0.53120      0.00583      0.00899 5.49291 6.06505
2 0610007P14Rik     1.10976 0.00033     59.67286     66.22232 0.20263 0.28827
3 0610009B22Rik     0.78500 0.00000     83.28470     65.37819 0.17445 0.33958
4 0610009L18Rik     0.79852 0.00011      6.88321      5.49638 0.46288 0.53295
5 0610009O20Rik     0.91615 0.00387     14.67696     13.44630 0.25430 0.26679
6 0610010B08Rik     0.87931 0.01455      1.10363      0.97043 0.39564 0.52364

I'd like to round the floating-point columns (the 2nd column onward) to 3 decimal places. How can I do this with dplyr::mutate_all()?

I tried this:

cols <- names(df)[2:7]
# df <- df %>% mutate_each_(funs(round(.,3)), cols)
# Warning message:
#'mutate_each_' is deprecated.
# Use 'mutate_all' instead.
# See help("Deprecated") 

df <- df %>% mutate_all(funs(round(.,3)), cols)

But get the following error:

Error in mutate_impl(.data, dots) : 
  3 arguments passed to 'round'which requires 1 or 2 arguments

Upvotes: 50

Views: 85416

Answers (3)

hanna-without-h

Reputation: 189

packageVersion("dplyr")
[1] '1.1.2'

Using df %>% mutate(across(2:7, round, 3)) gave the following warning message:

# ! The ... argument of across() is deprecated as of dplyr 1.1.0.
# Supply arguments directly to .fns through an anonymous function instead.

# Previously
across(a:b, mean, na.rm = TRUE)

# Now
across(a:b, \(x) mean(x, na.rm = TRUE))

Writing it as Arthur Yip suggested still works, but it triggers the deprecation warning above. To avoid problems in future versions, round() can be written in the new syntax like this:

df %>% mutate(across(2:7, \(x) round(x, 3)))
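To check the behaviour on something small, here is a minimal sketch, assuming R 4.1 or later (needed for the \(x) shorthand; function(x) works on older versions) and dplyr 1.0.0 or later. df_small is a hypothetical two-row stand-in built from values in the question's first two rows, not the full data.

```r
library(dplyr)

# Hypothetical stand-in for the question's df (values from its first two rows)
df_small <- tibble(
  gene_symbol = c("0610005C13Rik", "0610007P14Rik"),
  fold_change = c(1.54037, 1.10976),
  pvalue      = c(0.5312, 0.00033)
)

# Round columns 2 and 3 by position; the anonymous function carries
# the digits argument instead of passing it through across()'s dots
rounded <- df_small %>%
  mutate(across(2:3, \(x) round(x, 3)))

rounded$fold_change  # 1.54 1.11
rounded$pvalue       # 0.531 0.000
```

The non-numeric gene_symbol column is left alone because it falls outside the 2:3 selection.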

Upvotes: 2

Arthur Yip

Reputation: 6230

While the new across() function is slightly more verbose than the previous mutate_if variant, the dplyr 1.0.0 updates make the tidyverse language and code more consistent and versatile.

This is how to round specified columns:

df %>% mutate(across(2:7, round, 3)) # columns 2-7 by position

df %>% mutate(across(all_of(cols), round, 3)) # columns named in the character vector cols

This is how to round all numeric columns to 3 decimal places:

df %>% mutate(across(where(is.numeric), round, 3))

This is how to round all columns, but it won't work in this case because gene_symbol is not numeric:

df %>% mutate(across(everything(), round, 3))

Where we put where(is.numeric) in across's arguments, you could put in other column specifications such as -1 or -gene_symbol to exclude column 1. See help(tidyselect) for even more options.
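As a rough illustration of those other column specifications, here is a sketch assuming dplyr 1.0.0 or later and R 4.1 or later. df_small is a hypothetical cut-down version of the question's data, and the anonymous-function form is used so that it should also run without the deprecation warning on dplyr 1.1.0 and later.

```r
library(dplyr)

# Hypothetical cut-down version of the question's df
df_small <- tibble(
  gene_symbol = factor(c("0610005C13Rik", "0610007P14Rik")),
  fold_change = c(1.54037, 1.10976),
  ctr.cv      = c(5.49291, 0.20263)
)

# Exclude one column by name; every remaining column must be numeric
by_exclusion <- df_small %>%
  mutate(across(-gene_symbol, \(x) round(x, 3)))

# Select by a character vector of names held in a variable
cols <- c("fold_change", "ctr.cv")
by_names <- df_small %>%
  mutate(across(all_of(cols), \(x) round(x, 3)))

# Select by name suffix; only ctr.cv is rounded here
by_pattern <- df_small %>%
  mutate(across(ends_with(".cv"), \(x) round(x, 3)))
```

In this sketch by_exclusion and by_names hold the same values, while by_pattern leaves fold_change unrounded.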


Update for dplyr 1.0.0

The across() function supersedes the _if/_all/_at/_each variants of dplyr verbs. See https://dplyr.tidyverse.org/dev/articles/colwise.html#how-do-you-convert-existing-code


Old answer: Since some columns are not numeric, you could use mutate_if, which has the added benefit of rounding a column if and only if it is numeric:

df %>% mutate_if(is.numeric, round, 3)

Upvotes: 119

packageVersion("dplyr")
[1] '0.7.6'

Try

df %>% mutate_at(2:7, funs(round(., 3))) 

It works!!
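Note for later dplyr versions: funs() was deprecated in dplyr 0.8.0. A minimal sketch of the same mutate_at() call with a formula lambda instead, assuming dplyr 0.8.0 or later (df_small is a hypothetical two-row stand-in for the question's data):

```r
library(dplyr)

# Hypothetical stand-in for the question's df
df_small <- tibble(
  gene_symbol = c("0610005C13Rik", "0610007P14Rik"),
  fold_change = c(1.54037, 1.10976),
  pvalue      = c(0.5312, 0.00033)
)

# Same positional selection as above, with ~ round(., 3) replacing funs()
rounded_at <- df_small %>%
  mutate_at(2:3, ~ round(., 3))

rounded_at$fold_change  # 1.54 1.11
```

mutate_at() itself is superseded by across() from dplyr 1.0.0 onward, so this form is mainly useful for code that has to keep running on older installations.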

Upvotes: 8
