Reputation: 175
I'm attempting to write a function that uses the prob package to compute conditional probabilities. When using the function I continue to encounter the same error, which states an object within the function cannot be found.
Below is a reproducible example in which I compute a conditional probability without the function and then attempt to use the function to produce the same result. I'm not sure if the error is due to limitations with the prob package or an error on my part.
# Load prob package
library(prob)
# Set seed for reproducibility
set.seed(30)
# Sample data frame
sampledata <- data.frame(
X <- sample(1:10),
Y <- sample(c(-1, 0, 1), 10, replace=TRUE))
# Set probability space
S <- probspace(sampledata)
# Subset Y between -1 and 0
A <- subset(S, Y>=-1 & Y<=0)
# Subset X greater than 6
B <- subset(S, X>6)
# Compute conditional probability
P <- prob(A, given=B)
The above code produces the following probability:
> P
[1] 0.25
Attempting to write a function to calculate the same probability:
# Create function with data frame, variables, and conditional inputs
prob.function <- function(df, variable1, variable2, state1, state2, cond1){
s <- probspace(df)
a <- subset(s, variable1>=state1 & variable1<=state2)
b <- subset(s, variable2>cond1)
p <- prob(a, given=b)
return(p)
}
# Demonstrate the function
test <- prob.function(sampledata, Y, X, -1, 0, 6)
This function gives the following error:
Error in eval(expr, envir, enclos) : object 'b' not found
Any help you can provide would be great.
Thanks!
Upvotes: 2
Views: 2616
Reputation: 12819
I don't think this is a bug in package prob.
First, you should create your sampledata as
sampledata <- data.frame(
X = sample(1:10),
Y = sample(c(-1, 0, 1), 10, replace=TRUE))
Your original code uses <- inside data.frame(), so it creates not only this data frame but also variables X and Y in the global environment. Those global variables are what actually get used later when you call your function.
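A minimal base R sketch of that side effect, using hypothetical names u, w, p and q rather than the question's X and Y so it does not touch your workspace:

```r
# `<-` inside a call is an ordinary assignment: it runs in the calling
# environment and only its value is passed on as an (unnamed) argument.
df_arrow <- data.frame(u <- 1:3, w <- c(-1, 0, 1))

exists("u")        # the assignment leaked a variable `u` outside the data frame
names(df_arrow)    # column names are built from the expressions, not "u" and "w"

# `=` inside a call names the argument, so the columns get the intended names
# and nothing is created outside the data frame.
df_equal <- data.frame(p = 1:3, q = c(-1, 0, 1))
names(df_equal)
```

This is why subset(S, Y>=-1 & Y<=0) appeared to work at the top level: Y was found as a leaked global variable rather than as a column.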
Second, you shouldn't call subset() inside a function. Use bracket subsetting instead:
prob.function <- function(df, variable1, variable2, state1, state2, cond1){
s <- probspace(df)
a <- s[s[[variable1]]>=state1 & s[[variable1]]<=state2, ]
b <- s[s[[variable2]]>cond1, ]
p <- prob(a, given=b)
return(p)
}
And pass variable1 and variable2 as strings:
test <- prob.function(sampledata, "Y", "X", -1, 0, 6)
Now you have test==0.25, and no error.
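To illustrate why the bracket version is more robust, here is a small base R sketch. It does not use prob, and the data frame and the helper names filter_subset and filter_bracket are made up for the example:

```r
# Hypothetical example data; the column names are illustrative only.
d <- data.frame(col_a = 1:5, col_b = c(-1, 0, 1, 0, -1))

# subset() evaluates its condition with non-standard evaluation, so the
# argument `variable` is not treated as a column name of `d`.
filter_subset <- function(dat, variable, cutoff) {
  subset(dat, variable > cutoff)
}

# Bracket subsetting with [[ ]] looks the column up by its name as a string.
filter_bracket <- function(dat, variable, cutoff) {
  dat[dat[[variable]] > cutoff, ]
}

# Passing a bare column name fails: `col_a` is looked up in the caller's
# environment, where no such object exists.
res_subset <- tryCatch(
  filter_subset(d, col_a, 3),
  error = function(e) conditionMessage(e)
)

# Passing the name as a string works as intended.
res_bracket <- filter_bracket(d, "col_a", 3)
```

Here res_subset holds the error message, while res_bracket holds the rows of d where col_a is greater than 3.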
Upvotes: 1
Reputation: 55350
This looks like a bug in prob.
When I run this in vanilla R, I get the same error. But when I create an object b in my workspace, the error disappears:
> print(b)
Error in print(b) : object 'b' not found
> test <- prob.function(sampledata, Y, X, -1, 0, 6)
Error in eval(expr, envir, enclos) : object 'b' not found
>
> b <- "dummy variable"
> print(b)
[1] "dummy variable"
> test <- prob.function(sampledata, Y, X, -1, 0, 6)
> test
[1] 0.25
>
As a temporary workaround, just create a dummy b in your current environment.
As for the bug, if you look at the source for prob.default (which in the example above is what prob(a, given=b) is eventually calling), you'll see the following section:
if (missing(given)) {
< cropped >
}
else {
f <- substitute(given)
g <- eval(f, x) <~~~~
if (!is.logical(g)) { <~~~~
if (!is.data.frame(given)) <~~~~
stop("'given' must be data.frame or evaluate to logical")
B <- given
}
...
< cropped >
}
Note that the code jumps from g to given, perhaps inadvertently. I would reach out to the package maintainer, as this may be an oversight.
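The lookup behaviour behind the error can be reproduced in base R without prob. The sketch below is an assumption about the mechanism, not code from the package: lookup_default, lookup_caller and wrapper are hypothetical stand-ins that only mimic the substitute()/eval() pattern shown above.

```r
# Mimics the pattern above: capture the expression, evaluate it in the data.
lookup_default <- function(x, given) {
  f <- substitute(given)
  # With a data frame as envir, enclos defaults to the frame that called
  # eval(), i.e. this function's own frame, and from there the environment
  # where the function was defined. The caller's local variables are skipped.
  eval(f, x)
}

# Same pattern, but falls back to the frame of whoever called this function.
lookup_caller <- function(x, given) {
  f <- substitute(given)
  eval(f, x, parent.frame())
}

# Plays the role of prob.function: local_b exists only inside this function.
wrapper <- function(fun) {
  local_b <- c(TRUE, FALSE)
  fun(data.frame(v = 1:2), local_b)
}

res_default <- tryCatch(
  wrapper(lookup_default),
  error = function(e) conditionMessage(e)
)
res_caller <- wrapper(lookup_caller)
```

Under these assumptions, res_default holds an "object not found" style message while res_caller holds the local vector. That would also explain why defining a dummy b in the workspace makes the error go away: the default lookup can reach the workspace but not the local variables of the calling function.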
Upvotes: 2