Reputation: 163
I would like to create new empty columns with dplyr, based on a vector containing the new variables names. The columns I would like to create will contain only "0" for example. It is probably very easy but I can't find a solution. Ideally, i would use dplyr::mutates function (mutate, mutate_at...), because I need a solution that also works with spark dataframes. I would normally use mutate_at() but it only works if the columns already exist.
I can't use tibble::add_column(). I unfortunately need a solution that also works with spark dataframes
library("dplyr")
# --- trying to create 3 new columns
# I would like to do something like this :
new_vars = c("var1", "var2", "var3")
mtcars %>%
mutate_at(.vars = new_vars, .funs = ~ 0)
Error in `tbl_at_vars()`:
! Can't subset columns that don't exist.
✖ Column `var1` doesn't exist.
mtcars %>%
mutate_at(.vars = c('mpg', 'cyl'), .funs = ~ 0)
new_vars = c("var1", "var2", "var3")
mtcars[, new_vars] = 0
library("data.table")
new_vars = c("var1", "var2", "var3")
mtcars_dt = as.data.table(mtcars)
mtcars_dt[, (new_vars) := 0]
Thank you.
Upvotes: 0
Views: 616
Reputation: 160387
I don't have spark
available, but does an arrow
connection closely-enough approximate what is needed?
arr <- arrow::arrow_table(mtcars)
Reduce(function(dat, this) mutate(dat, {{this}} := 0), new_vars, init=arr) %>%
collect()
# # A tibble: 32 × 14
# mpg cyl disp hp drat wt qsec vs am gear carb var1 var2 var3
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4 0 0 0
# 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4 0 0 0
# 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1 0 0 0
# 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1 0 0 0
# 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2 0 0 0
# 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1 0 0 0
# 7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4 0 0 0
# 8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2 0 0 0
# 9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2 0 0 0
# 10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4 0 0 0
# # ℹ 22 more rows
# # ℹ Use `print(n = ...)` to see more rows
Upvotes: 1