Reputation: 21
I am having some trouble defining a function in R. Maybe someone with more experience could quickly help me.
Sample data frame:
SALES <- c(21341,1241,5234)
EARNINGS <- c(12562,12356,12352)
df <- data.frame(SALES, EARNINGS)
I am interested in the deviation of a variable from its mean. This deviation (d) is calculated as follows:
p <- 0.1
m <- mean(df$SALES)
s <- sd(df$SALES)
d <- qnorm(1-p,mean=m,sd=s)-m
> d
[1] 13637.03
Now, I tried to wrap this calculation in a function with the following inputs: data frame, variable (column), and p. But instead of 13637.03, I obtain NA as a result:
calculate.d <- function(x,y,p) {
m <- mean(x$y)
s <- sd(x$y)
d <- qnorm(1-p,mean=m,sd=s)-m
return(d)}
d <- calculate.d(df,SALES,0.1)
> d
[1] NA
Why do the two formulations not give the same result? How do I have to adjust the function to get the desired result?
Upvotes: 0
Views: 68
Reputation: 6902
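If your column name is stored in a variable (in your case, a function parameter), x$y will not work: $ does not evaluate y, so it looks for a column literally named "y", finds none, and returns NULL, which is why you end up with NA. What you can do instead is use x[, y] to retrieve the right column. Note that you should pass the column name as a string (so "SALES", not SALES):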
calculate.d <- function(x, y, p) {
  column <- x[, y]
  m <- mean(column)
  s <- sd(column)
  d <- qnorm(1 - p, mean = m, sd = s) - m
  d
}
# Note that "SALES" is a string here
calculate.d(df, "SALES", 0.1)
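To see the difference directly, here is a minimal sketch using the sample data from the question. It also shows a variant that uses `[[` instead of `[, ]`, which should behave the same on a plain data frame; the name calculate.d2 is only a placeholder for illustration, not something from the original code.

```r
# Sample data from the question
df <- data.frame(SALES = c(21341, 1241, 5234),
                 EARNINGS = c(12562, 12356, 12352))

# `$` takes its right-hand side literally, so df$y looks for a column named "y"
y <- "SALES"
is.null(df$y)                  # TRUE: df has no column called "y"
identical(df[[y]], df$SALES)   # TRUE: `[[` evaluates y before the lookup

# Hypothetical variant of calculate.d that uses `[[` for the column lookup
calculate.d2 <- function(x, y, p) {
  column <- x[[y]]             # y must be a string such as "SALES"
  m <- mean(column)
  s <- sd(column)
  qnorm(1 - p, mean = m, sd = s) - m
}

d2 <- calculate.d2(df, "SALES", 0.1)
d2                             # about 13637.03, matching the question
```

Either form of indexing works here; `[[` always extracts a single column, so it is a reasonable default when the name comes from a variable.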
Upvotes: 1