qnp1521

Reputation: 896

R: round numbers to a different number of digits based on their value, across multiple columns

I'm trying to round numbers in multiple columns, using a different number of digits depending on the value. Specifically, I want to round to an integer if the absolute value is larger than 1, and to three decimal places otherwise. I've tried a few strategies from answers to similar questions, but none of them seems to work. Here's a reproducible example.

df <- structure(list(dep = c("cyl", "cyl", "disp", "disp", "drat", 
"drat", "hp", "hp", "mpg", "mpg"), name = c("estimate", "t_stat", 
"estimate", "t_stat", "estimate", "t_stat", "estimate", "t_stat", 
"estimate", "t_stat"), dat1 = c(1.15052520023357, 6.68591106097725, 
102.901631449292, 12.1072688820387, -0.422439347353398, -5.23657414425551, 
37.5762984208224, 5.06741973124599, -5.05739510901596, -8.18496613472796
), dat2 = c(1.27442224382304, 8.42316433209027, 106.428896001266, 
12.147509560065, -0.393755429958381, -5.30373672190043, 38.64345279421, 
6.17204732384094, -4.84272702226804, -10.6216411092441), dat3 = c(1.07794895749739, 
5.1912094236003, 103.687423254053, 7.78976856569243, -0.19357672324514, 
-2.62921011406252, 36.7770360009548, 4.84248650357675, -4.53918562415258, 
-7.91010248086649)), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"))

According to these criteria, every number in columns dat1 to dat3 should become an integer, except for the values in the fifth row. I've tried the following two approaches, but neither gave the desired result.

df %>% mutate_if( is.numeric(.) == T & abs(.) > 10, round, 0) 
Error in Math.data.frame(.) : 
  non-numeric variable(s) in data frame: dep, name

In the second approach, everything seems to work, but the fifth row is also rounded to 0 digits.

> df %>% mutate_if( ~ is.numeric(.) == T && abs(.) > 1, round, 0) 
# A tibble: 10 x 5
   dep   name      dat1  dat2  dat3
   <chr> <chr>    <dbl> <dbl> <dbl>
 1 cyl   estimate     1     1     1
 2 cyl   t_stat       7     8     5
 3 disp  estimate   103   106   104
 4 disp  t_stat      12    12     8
 5 drat  estimate     0     0     0
 6 drat  t_stat      -5    -5    -3
 7 hp    estimate    38    39    37
 8 hp    t_stat       5     6     5
 9 mpg   estimate    -5    -5    -5
10 mpg   t_stat      -8   -11    -8

My real problem involves many columns to mutate, so combining round with mutate_if (or something similar) is strongly preferred. Thanks!

Upvotes: 0

Views: 1846

Answers (2)

py_b

Reputation: 189

The correct syntax would be:

df %>%
  mutate_if(
    is.numeric,
    ~ ifelse(abs(.x) > 1, round(.x), round(.x, 3))
  )

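(The second argument of mutate_if, the predicate, must be a function that selects columns; here it is is.numeric. The condition on the individual values belongs inside the function that does the rounding.)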
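To see why the condition has to live inside the rounding function, here is a minimal base R sketch of the same element-wise logic on a plain vector. The helper name round_by_size is hypothetical and only used for illustration; the input values are taken from the question's data.

```r
# Hypothetical helper (not part of dplyr): rounds each element to an integer
# if its absolute value exceeds 1, and to three decimal places otherwise.
round_by_size <- function(x) {
  # ifelse() is vectorised, so the test is evaluated per element
  ifelse(abs(x) > 1, round(x), round(x, 3))
}

# Three values from column dat1 of the question's data
x <- c(1.15052520023357, -0.422439347353398, 102.901631449292)
rounded <- round_by_size(x)
rounded
```

mutate_if then only has to decide which columns receive this function; it does not look at individual values.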

Upvotes: 1

henhesu

Reputation: 851

Try the case_when function from the dplyr package for complex condition handling:

library(dplyr)

df %>% 
  mutate_at(.vars = vars(dat1, dat2, dat3),
            .funs = ~ case_when(abs(.x) > 1 ~ round(.x, digits = 0),
                                TRUE ~ round(.x, digits = 3)))

# A tibble: 10 x 5
   dep   name         dat1     dat2     dat3
   <chr> <chr>       <dbl>    <dbl>    <dbl>
 1 cyl   estimate    1        1        1    
 2 cyl   t_stat      7        8        5    
 3 disp  estimate  103      106      104    
 4 disp  t_stat     12       12        8    
 5 drat  estimate   -0.422   -0.394   -0.194
 6 drat  t_stat     -5       -5       -3    
 7 hp    estimate   38       39       37    
 8 hp    t_stat      5        6        5    
 9 mpg   estimate   -5       -5       -5    
10 mpg   t_stat     -8      -11       -8    

Here we use mutate_at to transform the three variables dat1 to dat3 (specified in the .vars argument) and pass case_when as a formula-style lambda function. This rounds each value to an integer (i.e., digits = 0) if its absolute value is larger than 1, and to three decimal places otherwise.

Side note:

While this approach is somewhat more verbose, it lets you flexibly choose which variables the function is applied to and add more complex conditions. If you are sure that you want to apply the function to every numeric variable, you can of course use mutate_if with an is.numeric predicate instead, and keep case_when for the condition handling:

df %>%
  mutate_if(.predicate = is.numeric,
            .funs = ~ case_when(abs(.x) > 1 ~ round(.x, digits = 0),
                                TRUE ~ round(.x, digits = 3)))
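In dplyr 1.0.0 and later, the scoped verbs (mutate_if, mutate_at) are superseded by across(). A minimal sketch of the same rounding rule with across() follows, assuming a recent dplyr is installed. The small data frame df_small is a hypothetical stand-in that reuses a few values from the question, and if_else is swapped in for case_when since there are only two branches.

```r
library(dplyr)

# Hypothetical stand-in data: rows 1, 5 and 3 of the question's dat1 and dat2
df_small <- tibble(
  dep  = c("cyl", "drat", "disp"),
  dat1 = c(1.15052520023357, -0.422439347353398, 102.901631449292),
  dat2 = c(1.27442224382304, -0.393755429958381, 106.428896001266)
)

# where(is.numeric) selects the columns; the lambda decides per value
result <- df_small %>%
  mutate(across(where(is.numeric),
                ~ if_else(abs(.x) > 1, round(.x), round(.x, 3))))
result
```

The character column dep is left untouched because where(is.numeric) does not select it.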

Upvotes: 1
