Reputation: 61
I have some variables var1, var2, ..., var100.
I would like to create new variables var1_trun, var2_trun, ..., var100_trun,
which should have the same values as var1, var2, ..., var100,
except for the values above the 90th percentile. Those values should be set equal to the 90th percentile of the original variables.
What is the best way to accomplish this?
I tried:
trun <- function(x) {
assign(paste0(substitute(x),"_trun"))<<-x
assign(paste0(substitute(x),"_trun"))[x>quantile(x, probs=seq(0,1,0.05))[19]]<<-quantile(x, probs=seq(0,1,0.05))[19]
}
trun(data$var1)
I get:
Error in assign(paste0(substitute(x), "_trun")) <<- x :
object 'x' not found.
Upvotes: 4
Views: 3176
Reputation: 6165
How about this?
Assumption: your variables are in a named list:
x<-c(1:10)
y<-c(10:100)
vars <- list(x,y)
names(vars)=c("x","y")
Then you could do:
# original and truncated variable names
varlist <- c("x", "y")
trunname <- function(x) paste0(x, "_trun")
# truncate a vector: values <= the 90th percentile remain unchanged, larger values are set to the 90th percentile
trun <- function(x) { q <- quantile(x, 0.9); ifelse(x <= q, x, q) }
# truncate each element of the list
vars_trun <- lapply(vars, trun)
# rename the truncated variables
names(vars_trun) <- trunname(varlist)
Output (vars followed by vars_trun):
$x
[1] 1 2 3 4 5 6 7 8 9 10
$y
[1] 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
[37] 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81
[73] 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
$x_trun
[1] 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 9.1
$y_trun
[1] 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57
[49] 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 91 91 91 91 91 91 91 91 91
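In case the 9.1 at the end of x_trun looks odd: quantile() interpolates between neighbouring order statistics by default (type = 7), so the 90th percentile does not have to be one of the observed values. A small sketch that reproduces the number by hand (h, lo and q_manual are just illustrative names):

```r
x <- 1:10

# Default quantile type 7: position h = (n - 1) * p + 1 in the sorted data
h  <- (length(x) - 1) * 0.9 + 1        # 9.1
lo <- floor(h)                          # 9

# Linear interpolation between the 9th and 10th sorted values
xs <- sort(x)
q_manual <- xs[lo] + (h - lo) * (xs[lo + 1] - xs[lo])

q_manual                                # 9.1
quantile(x, 0.9, names = FALSE)         # 9.1
```

If you would rather cap at an actually observed value, quantile() has other type settings (for example type = 1), but that is a different definition of the percentile from the one used above.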
Upvotes: 0
Reputation: 545588
This is really the wrong approach. Don't create variables named like this. To be clear, what you actually call them is relatively unimportant; what matters is that you have data of the same general shape, and such data belongs grouped into a homogeneous container. Maintain one variable that's a list, a vector or a matrix (depending on your data).
This will vastly simplify your code.
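As a rough sketch of what that could look like, assuming the variables are columns of a data frame (the data frame below and the helper name cap90 are made up for illustration, and only three columns are used instead of 100):

```r
# Hypothetical data frame standing in for the asker's `data`
set.seed(1)
data <- data.frame(var1 = rnorm(50), var2 = runif(50), var3 = rexp(50))

# Cap a vector at its own 90th percentile; smaller values stay unchanged
cap90 <- function(v) pmin(v, quantile(v, 0.9, names = FALSE))

# One lapply over the original columns; results are added as *_trun columns
data[paste0(names(data), "_trun")] <- lapply(data, cap90)

names(data)
```

Everything stays in one object, so there is no need for assign() or for building variable names from strings.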
That said, your code has a very straightforward error: instead of assign(…) <<- x, you need to do assign(…, x), and specify the target environment. So, in your case:
assign(paste(substitute(x), "trun", sep = "_"), x, envir = parent.frame())
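For completeness, here is a minimal sketch of the whole function built around that call. Two things in it are my assumptions rather than part of the original code: deparse() is wrapped around substitute(x) so the name is always a single string, and the function is called with a bare variable name. With trun(data$var1) the generated name would be the awkward "data$var1_trun".

```r
trun <- function(x) {
  # Capture the caller's expression as text, e.g. "var1"
  x_name <- deparse(substitute(x))
  # 90th percentile of the original values
  q <- quantile(x, 0.9, names = FALSE)
  # Create <name>_trun in the calling environment, capped at q
  assign(paste(x_name, "trun", sep = "_"), pmin(x, q), envir = parent.frame())
}

var1 <- 1:10
trun(var1)
var1_trun
```

This works, but it still leaves you with 100 separate calls and 100 loose variables, which is why the container approach is preferable.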
Upvotes: 7