Reputation: 311
What I'm trying to do: take columns from one data frame, recode them using ifelse statements, and move them to a new data frame, all the while using dplyr and pipes to do it in one shot.
Issue: The code works fine with just one column, but I run into problems once R encounters more than one column in the same line of code. The second column is not recognized and R throws the following error:
Error in mutate_impl(.data, dots) : Evaluation error: object 'var2_orig' not found.
Any thoughts on why this is? There may also be an easier way to do this without using ifelse statements. I am open to suggestions on this front as well, but I am still curious about how to do this with ifelse and why the error occurs with more than one column. Many thanks.
Sample code:
library(tidyverse)
# creating fake data set
df_orig <- tibble(var1_orig = sample(1:3, 50, replace = TRUE),
                  var2_orig = sample(-2:2, 50, replace = TRUE))
# works for one var (recoding 3's as NA, 2's as 1, and 1's as 0):
df_new <- df_orig %>%
  as_tibble() %>%
  transmute(var1_new = ifelse(var1_orig == 3, NA,
                       ifelse(var1_orig == 2, 1,
                       ifelse(var1_orig == 1, 0, var1_orig))))
# and works for the other var (recoding negatives as NA, 1's and 2's as 1,
# and leaving 0's as 0's):
df_new <- df_orig %>%
  as_tibble() %>%
  transmute(var2_new = ifelse(var2_orig < 0, NA,
                       ifelse(var2_orig == 1 | var2_orig == 2, 1, 0)))
# but not together in same line of code (error: var2_orig not recognized):
df_new <- df_orig %>%
  as_tibble() %>%
  transmute(var1_new = ifelse(var1_orig == 3, NA,
                       ifelse(var1_orig == 2, 1,
                       ifelse(var1_orig == 1, 0, var1_orig)))) %>%
  transmute(var2_new = ifelse(var2_orig < 0, NA,
                       ifelse(var2_orig == 1 | var2_orig == 2, 1, 0)))
Upvotes: 4
Views: 6721
Reputation: 13680
The dplyr verb transmute keeps only the variables you create and drops the original variables, so var2_orig is no longer present when the second transmute runs.
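A minimal sketch of that behaviour, assuming a small hypothetical tibble (df_demo and step1 are illustrative names, not from the question, and the recode is simplified):

```r
library(dplyr)

# Hypothetical three-row data set, for illustration only
df_demo <- tibble(var1_orig = c(1L, 2L, 3L),
                  var2_orig = c(-1L, 0L, 2L))

# After the first transmute, only the newly created column is left
step1 <- df_demo %>%
  transmute(var1_new = ifelse(var1_orig == 3, NA, var1_orig - 1L))

names(step1)
#> [1] "var1_new"
```

Because var2_orig is no longer a column of step1, a second transmute that refers to it fails with the "object 'var2_orig' not found" error.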
You can either create the two variables in the same transmute call, or use mutate and then drop the original variables if they are not needed.
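Both options could be sketched as follows with the question's ifelse recodes. The data is a small hypothetical stand-in for df_orig, and select(-ends_with("_orig")) is just one assumed way of dropping the originals:

```r
library(dplyr)

# Hypothetical stand-in for the question's df_orig
df_demo <- tibble(var1_orig = c(1L, 2L, 3L, 2L),
                  var2_orig = c(-2L, 0L, 1L, 2L))

# Option 1: create both variables in a single transmute call
df_new1 <- df_demo %>%
  transmute(
    var1_new = ifelse(var1_orig == 3, NA,
               ifelse(var1_orig == 2, 1,
               ifelse(var1_orig == 1, 0, var1_orig))),
    var2_new = ifelse(var2_orig < 0, NA,
               ifelse(var2_orig %in% 1:2, 1, 0))
  )

# Option 2: mutate keeps the originals, so drop them afterwards
df_new2 <- df_demo %>%
  mutate(
    var1_new = ifelse(var1_orig == 3, NA,
               ifelse(var1_orig == 2, 1,
               ifelse(var1_orig == 1, 0, var1_orig))),
    var2_new = ifelse(var2_orig < 0, NA,
               ifelse(var2_orig %in% 1:2, 1, 0))
  ) %>%
  select(-ends_with("_orig"))
```

Either way, both recodes see the original columns, because nothing is dropped until all new variables have been computed.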
By the way, case_when would be useful here:
df_orig %>%
  transmute(var1_new = case_when(var1_orig == 3 ~ NA_integer_,
                                 var1_orig == 2 ~ 1L,
                                 TRUE ~ var1_orig),
            var2_new = case_when(var2_orig < 0 ~ NA_integer_,
                                 var2_orig %in% 1:2 ~ 1L,
                                 TRUE ~ 0L)
  )
#> # A tibble: 50 x 2
#> var1_new var2_new
#> <int> <int>
#> 1 1 1
#> 2 1 1
#> 3 1 0
#> 4 NA NA
#> 5 NA 0
#> 6 1 NA
#> 7 1 1
#> 8 1 1
#> 9 1 1
#> 10 1 1
#> # ... with 40 more rows
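Note that the var1_new recode above leaves 1's unchanged, whereas the question maps 1's to 0. A sketch that covers all three cases, run on a small hypothetical data set (df_demo) rather than the question's random sample:

```r
library(dplyr)

# Hypothetical data set, for illustration only
df_demo <- tibble(var1_orig = c(1L, 2L, 3L),
                  var2_orig = c(-1L, 0L, 2L))

df_new <- df_demo %>%
  transmute(
    # 3 -> NA, 2 -> 1, 1 -> 0, anything else is kept as-is
    var1_new = case_when(var1_orig == 3 ~ NA_integer_,
                         var1_orig == 2 ~ 1L,
                         var1_orig == 1 ~ 0L,
                         TRUE           ~ var1_orig),
    # negatives -> NA, 1 and 2 -> 1, everything else -> 0
    var2_new = case_when(var2_orig < 0      ~ NA_integer_,
                         var2_orig %in% 1:2 ~ 1L,
                         TRUE               ~ 0L)
  )
```

The typed missing value NA_integer_ is used so that every branch returns an integer, which case_when expects.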
Upvotes: 5
Reputation: 51582
transmute drops the original variables, hence the error. You can use mutate to update the existing variables and then rename_all (if needed) to change their names:
df_orig %>%
  as_tibble() %>%
  mutate(var1_orig = ifelse(var1_orig == 3, NA,
                     ifelse(var1_orig == 2, 1,
                     ifelse(var1_orig == 1, 0, var1_orig))),
         var2_orig = ifelse(var2_orig < 0, NA,
                     ifelse(var2_orig == 1 | var2_orig == 2, 1, 0))) %>%
  # funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job
  rename_all(~ sub('_.*', '_new', .))
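On dplyr 1.0.0 or later, rename_with is the newer counterpart of rename_all. A sketch of the same idea under that version assumption, using hypothetical data (df_demo and df_renamed are illustrative names):

```r
library(dplyr)

# Hypothetical data set, for illustration only
df_demo <- tibble(var1_orig = c(1L, 2L, 3L),
                  var2_orig = c(-1L, 0L, 2L))

df_renamed <- df_demo %>%
  # overwrite the existing columns in place
  mutate(var1_orig = ifelse(var1_orig == 3, NA,
                     ifelse(var1_orig == 2, 1,
                     ifelse(var1_orig == 1, 0, var1_orig))),
         var2_orig = ifelse(var2_orig < 0, NA,
                     ifelse(var2_orig %in% 1:2, 1, 0))) %>%
  # replace everything from the first underscore onwards with "_new"
  rename_with(~ sub("_.*", "_new", .x))

names(df_renamed)
#> [1] "var1_new" "var2_new"
```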
Upvotes: 5