Eric Fail

Reputation: 7928

cut() and label everything in a tibble by the same breaks and labels

I have a tibble that is 8984 by 155, and I need to cut() and label all columns in the same way, i.e. using the same breaks and the same labels, to create a new labeled tibble. How do I do this in a simple way?

Here is a 3 by 3 tibble to simulate my 8984 by 155 one:

# install.packages(c("tidyverse", "lubridate"), dependencies = TRUE)
library(tidyverse)
df <- tibble(x = 1:3, y = c(4, NA, 6))
df <- df %>% mutate(iD = row_number())
df
#> # A tibble: 3 x 3
#>       x     y    iD
#>   <int> <dbl> <int>
#> 1     1  4.00     1
#> 2     2 NA        2
#> 3     3  6.00     3

I currently label it like this. I realize I can create a breaks object and a labels object and reuse them, but isn't there a way to avoid repeating the cut() call inside mutate() for every column?

df_labeled <- df %>%
  mutate(x = cut(x, breaks = c(-Inf, 1, 3, 6),
                 labels = c('Low', 'middle', 'high'), include.lowest = TRUE),
         y = cut(y, breaks = c(-Inf, 1, 3, 6),
                 labels = c('Low', 'middle', 'high'), include.lowest = TRUE)) %>%
  select(iD, x, y)

This gives me what I want, but I am looking for a more general way.

df_labeled
#> # A tibble: 3 x 3
#>      iD x      y    
#>   <int> <fct>  <fct>
#> 1     1 Low    high 
#> 2     2 middle <NA> 
#> 3     3 middle high

P.S. Am I the only one who gets an error when I call my id variable id?

Inspired by jazzurro's comment, I am currently experimenting with this:

df %>% mutate_at(vars(-iD),cut(as.numeric(.), breaks = c(-Inf,1,3,6), 
            labels = c('Low', 'middle', 'high'), include.lowest = TRUE)) 

but I still get an error,

Error in cut(as.numeric(.), breaks = c(-Inf, 1, 3, 6), labels = c("Low",  : 
  (list) object cannot be coerced to type 'double'

I'm currently reading the manual to figure that one out.

Upvotes: 0

Views: 450

Answers (1)

duckmayr

Reputation: 16910

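Your attempt fails because mutate_at() expects a function (with any extra arguments passed along after it), not a call. As written, cut(as.numeric(.), ...) is evaluated immediately, with . bound to the whole tibble by the pipe, which is why you get the "(list) object cannot be coerced to type 'double'" error. Pass cut itself and you don't need as.numeric(.):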

df %>%
    mutate_at(vars(-iD), cut, breaks = c(-Inf, 1, 3, 6), include.lowest = TRUE,
              labels = c('Low', 'middle', 'high'))

# A tibble: 3 x 3
       x      y    iD
  <fctr> <fctr> <int>
1    Low   high     1
2 middle   <NA>     2
3 middle   high     3
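
For what it's worth, in dplyr 1.0.0 and later mutate_at() is superseded by across(). Below is a minimal sketch of the same idea, assuming a recent dplyr is installed. The object names brks and labs are my own (not from the question); they just hold the breaks and labels once so every column is cut identically.

```r
library(dplyr)

# Same toy data as in the question
df <- tibble(x = 1:3, y = c(4, NA, 6)) %>%
  mutate(iD = row_number())

# Hypothetical helper objects: define breaks and labels once and reuse them
brks <- c(-Inf, 1, 3, 6)
labs <- c("Low", "middle", "high")

# Apply the same cut() to every column except iD
df_labeled <- df %>%
  mutate(across(-iD, ~ cut(.x, breaks = brks, labels = labs,
                           include.lowest = TRUE))) %>%
  select(iD, x, y)

df_labeled
```

On the full 8984 by 155 tibble, the same call should work unchanged as long as every column other than iD is numeric; otherwise the selection inside across() would need narrowing, e.g. to where(is.numeric).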

Upvotes: 1
