tall_table

Reputation: 311

multiple ifelse statements and dplyr pipes, not recognizing 2nd object

What I'm trying to do: take columns from one data frame, recode them using ifelse statements, and move them to a new data frame, all the while using dplyr and pipes to do it in one shot.

Issue: The code works fine with just one column, but I run into problems once R encounters more than one column in the same pipeline. The second column is not recognized and R throws the following error:

Error in mutate_impl(.data, dots) : Evaluation error: object 'var2_orig' not found.

Any thoughts on why this is? There may also be an easier way to do this without using ifelse statements. I am open to suggestions on this front as well, but I am still curious about how to do this with ifelse and why the error occurs with more than one column. Many thanks.

Sample code:

library(tidyverse)

# creating fake data set
df_orig <- tibble(var1_orig = sample(1:3, 50, replace = TRUE),
                  var2_orig = sample(-2:2, 50, replace = TRUE))

# works for one var (recoding 3's as NA, 2's as 1, and 1's as 0):

df_new <- df_orig %>%
  as_tibble() %>%
  transmute(var1_new = ifelse(var1_orig == 3, NA, ifelse(var1_orig == 
  2, 1, ifelse(var1_orig == 1, 0, var1_orig))))

# and works for the other var (recoding negatives as NA, 1's and 2's as 1,
# and leaving 0's as 0's):

df_new <- df_orig %>%
  as_tibble() %>%
  transmute(var2_new = ifelse(var2_orig < 0, NA, ifelse(var2_orig == 
  1 | var2_orig == 2, 1, 0)))

# but not together in the same pipeline (error: object 'var2_orig' not found):

df_new <- df_orig %>%
  as_tibble() %>%
  transmute(var1_new = ifelse(var1_orig == 3, NA, ifelse(var1_orig == 
  2, 1, ifelse(var1_orig == 1, 0, var1_orig)))) %>%
  transmute(var2_new = ifelse(var2_orig < 0, NA, ifelse(var2_orig == 
  1 | var2_orig == 2, 1, 0)))

Upvotes: 4

Views: 6721

Answers (2)

GGamba

Reputation: 13680

The dplyr verb transmute keeps only the variables you create and drops the original variables, so var2_orig is no longer present when the second transmute runs.

You can either create the two variables in the same transmute call, or use mutate and then drop the original variables if they are not needed.
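A minimal sketch of both points, using a small hand-made tibble (`df_demo` is a made-up stand-in for `df_orig`, and `var1_orig - 1L` is just a shortcut for the 1 -> 0, 2 -> 1 recode). The first call shows that only the new column survives transmute; the second chains two mutate calls and then drops the originals with select:

```r
library(dplyr)

# Hypothetical toy data standing in for df_orig
df_demo <- tibble(var1_orig = c(1L, 2L, 3L),
                  var2_orig = c(-1L, 0L, 2L))

# transmute() keeps only what it creates, so var2_orig is gone afterwards
after_first <- df_demo %>%
  transmute(var1_new = ifelse(var1_orig == 3, NA, var1_orig - 1L))
names(after_first)  # "var1_new"

# mutate() keeps the originals, so the two recodes can be chained;
# select() then drops the *_orig columns
df_demo_new <- df_demo %>%
  mutate(var1_new = ifelse(var1_orig == 3, NA, var1_orig - 1L)) %>%
  mutate(var2_new = ifelse(var2_orig < 0, NA,
                           ifelse(var2_orig %in% 1:2, 1, 0))) %>%
  select(-ends_with("_orig"))
names(df_demo_new)  # "var1_new" "var2_new"
```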

By the way, case_when would be useful here:

df_orig %>% 
  transmute(var1_new = case_when(var1_orig == 3 ~ NA_integer_,
                              var1_orig == 2 ~ 1L,
                              TRUE ~ var1_orig),
            var2_new = case_when(var2_orig < 0 ~ NA_integer_,
                              var2_orig %in% 1:2 ~ 1L,
                              TRUE ~ 0L)
  )
#> # A tibble: 50 x 2
#>    var1_new var2_new
#>       <int>    <int>
#>  1        1        1
#>  2        1        1
#>  3        1        0
#>  4       NA       NA
#>  5       NA        0
#>  6        1       NA
#>  7        1        1
#>  8        1        1
#>  9        1        1
#> 10        1        1
#> # ... with 40 more rows
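Note that the var1_new recode above leaves 1's as 1, whereas the question recodes 1's as 0. If that mapping is wanted, one extra branch should cover it. A sketch on made-up data (`df_check` is hypothetical and simply contains every value that gets recoded):

```r
library(dplyr)

# Hypothetical toy data covering every value that gets recoded
df_check <- tibble(var1_orig = c(1L, 2L, 3L, 1L, 2L),
                   var2_orig = c(-2L, -1L, 0L, 1L, 2L))

df_check_new <- df_check %>%
  transmute(var1_new = case_when(var1_orig == 3 ~ NA_integer_,
                                 var1_orig == 2 ~ 1L,
                                 var1_orig == 1 ~ 0L,   # added branch: 1's become 0
                                 TRUE ~ var1_orig),
            var2_new = case_when(var2_orig < 0 ~ NA_integer_,
                                 var2_orig %in% 1:2 ~ 1L,
                                 TRUE ~ 0L))

df_check_new$var1_new  # 0 1 NA 0 1
df_check_new$var2_new  # NA NA 0 1 1
```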

Upvotes: 5

Sotos

Reputation: 51582

transmute drops the original variables, hence the error. You can use mutate to update the existing variables in place and then rename_all (if needed) to change their names:

df_orig %>%
    as_tibble() %>%
    mutate(var1_orig = ifelse(var1_orig == 3, NA, ifelse(var1_orig == 2, 1,
                                                         ifelse(var1_orig == 1, 0, var1_orig))),
           var2_orig = ifelse(var2_orig < 0, NA, ifelse(var2_orig == 1 | var2_orig == 2, 1, 0))) %>%
    rename_all(funs(sub('_.*', '_new', .)))
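In more recent dplyr releases funs() is deprecated and rename_all is superseded. Assuming dplyr 1.0.0 or later is installed, rename_with should do the same renaming. A sketch on made-up data (`df_demo` is a hypothetical stand-in for `df_orig`):

```r
library(dplyr)

# Hypothetical toy data standing in for df_orig
df_demo <- tibble(var1_orig = c(1L, 2L, 3L),
                  var2_orig = c(-1L, 0L, 2L))

df_renamed <- df_demo %>%
  mutate(var1_orig = ifelse(var1_orig == 3, NA,
                            ifelse(var1_orig == 2, 1,
                                   ifelse(var1_orig == 1, 0, var1_orig))),
         var2_orig = ifelse(var2_orig < 0, NA,
                            ifelse(var2_orig %in% 1:2, 1, 0))) %>%
  # replace everything from the first underscore onwards with "_new"
  rename_with(~ sub("_.*", "_new", .x))

names(df_renamed)  # "var1_new" "var2_new"
```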

Upvotes: 5
